
Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 15,957 public repositories matching this topic...

ogrisel commented Nov 13, 2020

Most functions in scipy.linalg (e.g. svd, qr, eig, eigh, pinv, pinv2, ...) have a default kwarg check_finite=True that we typically leave at its default value in scikit-learn.

As we already validate the input data for most estimators in scikit-learn, this check is redundant and can cause significant overhead, especially at predict / transform time. We should probably a
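To illustrate the redundancy described above, here is a minimal sketch that validates the input once and then passes check_finite=False to scipy.linalg.svd. The helper name validated_svd and the explicit np.isfinite check are assumptions made for this example; they stand in for an estimator's own input validation and are not scikit-learn code.

```python
import numpy as np
from scipy import linalg


def validated_svd(X):
    """Validate X once, then skip SciPy's own finiteness check.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # One explicit validation pass, standing in for an estimator's input checks.
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    # check_finite=False avoids a second scan of the array inside scipy.linalg.svd.
    return linalg.svd(X, full_matrices=False, check_finite=False)


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
U, s, Vt = validated_svd(X)
```

Skipping the check is only safe when the caller has already ruled out NaN and infinite values; SciPy's documentation warns that non-finite inputs may otherwise cause crashes or non-termination.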


Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Oct 1, 2020
  • Python
s-rog commented Nov 30, 2020

There are some formatting issues with the arguments (the incorrect blocks and apostrophes). I've yet to go through the whole metrics docs, so I'm noting them here as I find them, for when I get a chance to fix them or in case anyone wants to help!


wetneb commented Nov 26, 2020

The options "Include Schema" and "Include Contents" in the SQL exporter dialog can be a bit mysterious for users.

Proposed solution

Just like concrete SQL commands are included in the UI text elsewhere in the dialog ("DROP"), we could expand these phrases to mention the corresponding SQL commands ("CREATE TABLE", "INSERT").

Alternatives considered

We could also drop the mention
