# Pandas - DataFrame
---

### Overview: 
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis/manipulation tool available in any language.

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. 

#### Website: https://pandas.pydata.org/docs/user_guide/index.html

#### Advantages

Pandas is well suited to many different kinds of data:

- Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet.

- Ordered and unordered (not necessarily fixed-frequency) time series data.

- Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels.

- Any other form of observational / statistical data sets. The data need not be labeled at all to be placed into a pandas data structure.
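
As a rough illustration of the two structures and of heterogeneously-typed columns, here is a minimal sketch. The column names and values are made up for the example and are not from any real data set.

```python
import pandas as pd

# Hypothetical sample data: each column holds a different type of value.
df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus"],
        "age": [36, 45, 29],
        "salary": [5200.0, 6100.5, 4800.0],
    },
    index=["a", "b", "c"],  # row labels
)

# Selecting a single column of a DataFrame gives back a Series.
ages = df["age"]

# Boolean filtering keeps only the rows where the condition holds.
older = df[df["age"] > 30]

print(df.dtypes)    # one dtype per column
print(ages.mean())  # Series support vectorized statistics
print(older)
```

The DataFrame here behaves much like a small SQL table: columns are addressed by name, rows by label, and filters are written as expressions over whole columns.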

# NumPy - Array
---

### Overview: 
Fast and versatile, the NumPy vectorization, indexing, and broadcasting concepts are the de facto standards of array computing today.

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

At the core of the NumPy package is the ndarray object. It encapsulates n-dimensional arrays of homogeneous data types, with many operations performed in compiled code for performance.
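
The ideas of vectorization, indexing, and broadcasting mentioned above can be sketched on a small array. The values are arbitrary and chosen only for illustration.

```python
import numpy as np

# A 2-D ndarray: every element shares one dtype (float64 here).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Vectorization: the multiplication applies to every element, no Python loop.
doubled = a * 2

# Broadcasting: the (3,) vector of column means is stretched across both rows.
col_means = a.mean(axis=0)
centered = a - col_means

# Indexing: select a row by position, or elements by a boolean mask.
first_row = a[0]
large = a[a > 3]

print(a.shape, a.dtype)
print(centered)
print(large)
```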

#### Website: https://numpy.org/ | https://numpy.org/doc/stable/user/index.html#user

# Seaborn - Statistical Data Visualization
---

### Overview:
Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
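
A minimal sketch of the high-level interface, assuming seaborn, pandas, and matplotlib are installed. The data frame is invented for the example, and the `Agg` backend is selected only so the script also runs on machines without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots

import pandas as pd
import seaborn as sns

# Hypothetical data: study hours, test scores, and a grouping column.
df = pd.DataFrame(
    {
        "hours": [1, 2, 3, 4, 5, 6],
        "score": [52, 58, 61, 70, 74, 83],
        "group": ["a", "b", "a", "b", "a", "b"],
    }
)

# One call maps DataFrame columns to the x axis, y axis, and point color.
ax = sns.scatterplot(data=df, x="hours", y="score", hue="group")
ax.set_title("Score vs. hours studied")
```

The call returns a regular matplotlib `Axes`, so the figure can be customized or saved further with the usual matplotlib methods.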

#### Website: https://seaborn.pydata.org/

# SKLearn - Machine Learning
---

### Overview:
Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities. 

Scikit-learn, referred to here as SKLearn after its import name `sklearn`, is a machine learning package for Python. Its name comes from "SciKit" (SciPy Toolkit): it began as a third-party extension built on top of the popular package SciPy.

SKLearn is built on NumPy, SciPy, and matplotlib. This has two major implications. First, SKLearn is fast and efficient. Second, and more relevant to us, it often prefers working with arrays.
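
To show what working with arrays looks like in practice, here is a small sketch that fits a linear regression on NumPy arrays. The data is synthetic (generated from y = 2x + 1), and `LinearRegression` is just one convenient estimator picked for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features are a 2-D array of shape (n_samples, n_features).
X = np.array([[1.0], [2.0], [3.0], [4.0]])

# Synthetic target generated from y = 2x + 1, a 1-D array of length n_samples.
y = 2.0 * X.ravel() + 1.0

# Estimators follow the same pattern: create, fit, then predict.
model = LinearRegression()
model.fit(X, y)

pred = model.predict(np.array([[5.0]]))

print(model.coef_, model.intercept_)
print(pred)
```

When the data starts out in a pandas DataFrame, a common approach is to convert the relevant columns with `DataFrame.to_numpy()` before fitting.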

#### Website: https://scikit-learn.org/stable/ | https://scikit-learn.org/stable/user_guide.html

#### Advantages

- First, it boasts incredible documentation. Whatever question you have about how something works, the SKLearn website usually answers it, often with example applications. This makes it a great supplementary resource while you are learning to work with the package.


- Second, its variety. For classical machine learning in Python, SKLearn is one of the most widely used packages. It covers, among others:
  - regression
  - classification
  - clustering
  - support vector machines
  - dimensionality reduction

- Third, its implementations are famously numerically stable.

#### Disadvantages

The only weak spot in its range is deep learning. TensorFlow, Keras, and PyTorch are much better alternatives in that case.