## 298. Introduction
- NumPy is a great library for homogeneous numeric data with integer-based indexing, but it is not well suited to today's Big Data
- Big Data needs data structures that can be easily customized
- Big Data comes in mixed types and can contain missing values that must be handled
- We also need various functions and mathematical operations that can be applied to Big Data; that is where `Pandas` comes in
- The name `Pandas` is derived from `Panel Data`
- Examples of panel data are stock prices, players' scores across matches, students' grades across exams, and so on

| NumPy            | Pandas        |
| ---------------- | ------------- |
| Numeric          | Custom        |
| Integer Indexing | Mixed/Missing |
|                  | Manipulation  |

- Pandas provides two main data structures
    1. `Series`
        - handles one-dimensional data
    2. `DataFrame`
        - handles two-dimensional data
- Pandas uses arrays behind the scenes and is closely related to the NumPy library
- Several NumPy functions accept `Series` and `DataFrame` objects as arguments, so you can use Pandas together with NumPy
- Both `Series` and `DataFrame` let us easily select and manipulate data
- We can apply `map` and reduce-style operations (such as `sum`) right out of the box
- We can perform various mathematical operations on Big Data
- We can also visualize the data in different formats
- All of this is built into `Pandas`
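
As a rough illustration of the points above, the sketch below builds a small `Series` and `DataFrame` with mixed types and a missing value, then passes the `Series` to a NumPy function. The names and scores are made-up sample data, not part of the course material.

```python
import numpy as np
import pandas as pd

# Made-up sample data: one score is missing (None is stored as NaN).
scores = pd.Series([10.0, None, 30.0], index=["alice", "bob", "carol"])

# A DataFrame holds columns of different types side by side.
players = pd.DataFrame({
    "name": ["alice", "bob", "carol"],
    "score": [10.0, None, 30.0],
})

# NumPy functions accept a Series and return a Series with the same index.
roots = np.sqrt(scores)

print(scores.mean())   # the missing value is skipped by default
print(players.dtypes)  # one dtype per column
print(roots)
```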

## 299. Series
- A `Series` is an enhanced one-dimensional array
- While arrays use numeric `zero-based indexing`, `Series` also support `custom indexing`, for example with strings
- `Series` also handle missing data: many `Series` functions skip missing values by default
- We can create a `Series` from a `list`, a `numpy.ndarray`, a `dict`, etc.
- The default index in a `Series` is a numeric value that starts from zero, but we can customize it
- We're going to create several `Series` of our own and explore different functions on `Series` like `count()`, `mean()`, `min()`, `max()`, `std()`, `describe()`, and more
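
The creation options listed above can be sketched as follows. This is a minimal example with invented subject names and grades; it assumes only the public `pd.Series` constructor.

```python
import numpy as np
import pandas as pd

from_list = pd.Series([4.6, 4.4, 4.8])              # default index 0, 1, 2
from_array = pd.Series(np.array([1, 2, 3]))         # built from a NumPy array
from_dict = pd.Series({"math": 90, "physics": 85})  # dict keys become the index

# Custom string index instead of the default numeric one (sample data).
grades = pd.Series([90, 85, np.nan], index=["math", "physics", "art"])

print(grades["math"])  # look up a value by its label
print(grades.count())  # counts only non-missing values
print(grades.mean())   # the missing value is skipped by default
```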

## 300. Create Project
- To install `Pandas` from the command line, run
``` shell
pip3 install pandas
```
- This will install `Pandas` for your Python environment
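
One way to confirm the installation, assuming you run the same Python interpreter that `pip3` installed into, is to import the library and print its version:

```python
import pandas as pd

# pd.__version__ is a plain string such as "2.x.y"; the exact value
# depends on what pip installed.
installed_version = pd.__version__
print("pandas version:", installed_version)
```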


## 301. Create and use Series
- We'll start exploring Pandas `Series`
- `pandas.Series(data=None, index=None, dtype: 'Dtype | None' = None, name=None, copy: 'bool | None' = None, fastpath: 'bool | lib.NoDefault' = <no_default>)`
    - One-dimensional ndarray with axis-labels (including time series)
    - `data` : array-like, Iterable, dict, or scalar value
        - Contains data stored in Series. If data is a dict, argument order is maintained.
    - `index` : array-like or Index (1d)
        - Values must be hashable and have the same length as `data`.
        - Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, ..., n) if not provided.
        - If data is dict-like and index is None, then the keys in the data are used as the index.
        - If the index is not None, the resulting Series is reindexed with the index values.
    - `dtype` : str, numpy.dtype, or ExtensionDtype, optional
        - Data type for the output Series. If not specified, this will be inferred from `data`.
    - `name` : Hashable, default None
        - The name to give to the Series.
    - `copy` : bool, default False
        - Copy input data. Only affects Series or 1d ndarray input
- `pandas.Series.count()`
    - Return the number of non-NA/null observations in the Series
- `pandas.Series.mean(axis: 'Axis | None' = 0, skipna: 'bool' = True, numeric_only: 'bool' = False, **kwargs)`
    - Return the mean of the values over the requested axis
- `pandas.Series.min(axis: 'Axis | None' = 0, skipna: 'bool' = True, numeric_only: 'bool' = False, **kwargs)`
    - Return the minimum of the values over the requested axis
- `pandas.Series.max(axis: 'Axis | None' = 0, skipna: 'bool' = True, numeric_only: 'bool' = False, **kwargs)`
    - Return the maximum of the values over the requested axis
- `pandas.Series.std(axis: 'Axis | None' = 0, skipna: 'bool' = True, ddof: 'int' = 1, numeric_only: 'bool' = False, **kwargs)`
    - Return the sample standard deviation over the requested axis
    - Normalized by N-1 by default. This can be changed using the `ddof` argument.
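
The `data` and `index` rules described above can be sketched with a dict input. The ticker symbols and prices here are made-up illustration values.

```python
import pandas as pd

prices = {"AAPL": 190.0, "MSFT": 410.0}  # made-up numbers

# With no index, the dict keys become the index.
by_key = pd.Series(prices, name="price")

# With an explicit index, the Series is reindexed: labels missing from
# the dict get NaN, and dict keys not listed in the index are dropped.
reindexed = pd.Series(prices, index=["MSFT", "GOOG"])

print(by_key)
print(reindexed)
print(reindexed.count())  # only non-missing values are counted
```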


``` python
# series_demo.py
import pandas as pd

reviews = pd.Series([4.6, 4.4, 4.8, 5])
print(reviews)  # first column is the index (starting at 0), second column is the data
print("reviews[0]:", reviews[0])  # access a Series element by its index
print("reviews.count():", reviews.count())  # number of non-null elements
print("reviews.mean():", reviews.mean())  # mean of the non-null elements
print("reviews.min():", reviews.min())  # minimum of the non-null elements
print("reviews.max():", reviews.max())  # maximum of the non-null elements
print("reviews.std():", reviews.std())  # sample standard deviation of the non-null elements
```

``` text
0    4.6
1    4.4
2    4.8
3    5.0
dtype: float64
reviews[0]: 4.6
reviews.count(): 4
reviews.mean(): 4.7
reviews.min(): 4.4
reviews.max(): 5.0
reviews.std(): 0.25819888974716104
```
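
As a small follow-up sketch using the same `reviews` data, the `ddof` argument switches between the sample and population standard deviation, and `describe()` gathers the common statistics in one call:

```python
import pandas as pd

reviews = pd.Series([4.6, 4.4, 4.8, 5])

sample_std = reviews.std()            # ddof=1 (default): divides by N - 1
population_std = reviews.std(ddof=0)  # ddof=0: divides by N
summary = reviews.describe()          # count, mean, std, min, quartiles, max

print("sample std:", sample_std)
print("population std:", population_std)
print(summary)
```

The sample value is the larger of the two because it divides the same sum of squared deviations by a smaller number.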


## 302. Use Custom indices
-