## Edureka 

**Ref: https://www.youtube.com/watch?v=UB3DE5Bgfx4**

`NB: (YT Title) -- Python Pandas Tutorial | Data Analysis with Python Pandas | Python Training | Edureka`


#### Agenda
+ Introduction to Pandas
+ Dataframes & Series
+ How to view data?
+ Selecting Data
+ Handling Missing Data
+ Pandas Operations
+ Merge, Group & Reshape Data
+ Time Series & Categoricals
+ Plotting using Pandas

**Use cases of Pandas:**
+ economics
+ stock prediction
+ recommendation systems
+ neuroscience
+ statistics
+ advertising

The two core data structures of Pandas are:
1. DataFrame
2. Series

> Definition of **DataFrame**: a two-dimensional, size-mutable, potentially heterogeneous tabular data structure. <br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
`pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)` <br/>

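> Definition of **Series**: a one-dimensional labelled array capable of holding data of any type. <br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
`pandas.Series(data=None, index=None, dtype=None, name=None, copy=None)` <br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
Commonly used attributes and methods: `values`, `size`, `hasnans`, `memory_usage(index=True, deep=False)`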

 <br/>

[NB]: A Series assigns an index label to every value taken from a Python list or tuple. A set cannot be passed directly, because it is unordered; Pandas raises a `TypeError`.
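
The note above can be checked with a minimal sketch, assuming only that `pandas` is installed. The variable names (`from_list`, `from_tuple`, `set_accepted`, `from_set`) are illustrative and not from the video.

```python
import pandas as pd

from_list = pd.Series([10, 20, 30])     # gets the default index 0, 1, 2
from_tuple = pd.Series((10, 20, 30))    # a tuple behaves the same way

try:
    pd.Series({10, 20, 30})             # a set has no order, so pandas rejects it
    set_accepted = True
except TypeError:
    set_accepted = False

# One workaround: turn the set into a sorted list first
from_set = pd.Series(sorted({10, 20, 30}))
```

Sorting the set is only one possible way to give its values a stable order before building the Series.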

In [2]:
import pandas as pd
import numpy as np

In [19]:
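# create a Series manually, using 8 consecutive dates as the index
# (pass the DatetimeIndex directly; wrapping it in a list builds a one-level MultiIndex instead)
list_var = [1, 2, 3, 4, 5, 6, np.nan, 7]
s = pd.Series(list_var, index=pd.date_range('20220402', periods=8))
s

2022-04-02    1.0
2022-04-03    2.0
2022-04-04    3.0
2022-04-05    4.0
2022-04-06    5.0
2022-04-07    6.0
2022-04-08    NaN
2022-04-09    7.0
Freq: D, dtype: float64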
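
The values print as `1.0`, `2.0`, ... rather than `1`, `2`, ... because `np.nan` is a float, so the whole Series is upcast to `float64`. A small sketch of that behaviour, with illustrative variable names:

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2, 3])              # integers only: integer dtype
with_nan = pd.Series([1, 2, np.nan])     # one NaN: upcast to float64

n_missing = int(with_nan.isna().sum())   # number of missing values
```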

In [13]:
# build a range of 10 consecutive dates starting from 2 April 2022
d = pd.date_range('20220402', periods=10)
d

DatetimeIndex(['2022-04-02', '2022-04-03', '2022-04-04', '2022-04-05',
               '2022-04-06', '2022-04-07', '2022-04-08', '2022-04-09',
               '2022-04-10', '2022-04-11'],
              dtype='datetime64[ns]', freq='D')
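
As a hedged aside, `pd.date_range` can also be given an end date or a different step. The sketch below uses illustrative names and shorter ranges than the cell above.

```python
import pandas as pd

daily = pd.date_range('20220402', periods=3)                  # default step is one day
by_end = pd.date_range('2022-04-02', '2022-04-04')            # start and end, both included
weekly = pd.date_range('2022-04-02', periods=3, freq='7D')    # one date every 7 days
```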

In [12]:
# create a NumPy array of 10 rows, each holding 4 values drawn from the standard normal distribution
# Syntax:  x = np.random.randn(n_rows, n_cols)   (dimensions are separate arguments, not a tuple)
x_rand = np.random.randn(10,4)
x_rand

array([[-0.8681373 , -0.45302135, -0.90581098, -0.16005717],
       [ 0.60205859,  0.49542359,  0.32660274, -0.58896524],
       [ 0.60294495, -1.37773496, -0.45624974, -1.49111595],
       [ 0.52591276,  0.86354085, -0.83283382,  0.82158628],
       [ 1.21933649,  0.51320728, -1.15663766,  0.82077322],
       [ 0.7854078 , -0.32425282,  0.91271905, -0.73428955],
       [-0.5911044 ,  1.30692527, -0.41831827,  1.28448261],
       [ 1.01696513, -1.31206945, -1.4412261 , -1.30231951],
       [-0.57847601, -0.19225266, -1.30346843,  2.50351278],
       [-0.07984346, -0.07269122,  1.04702417,  0.16653507]])
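
A quick sketch of how `np.random.randn` takes its dimensions, assuming nothing beyond NumPy itself. The seed value and the names `x_demo` and `tuple_ok` are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)                  # arbitrary fixed seed so reruns give the same values
x_demo = np.random.randn(10, 4)    # 10 rows, 4 columns

try:
    np.random.randn((10, 4))       # a tuple is not accepted by randn
    tuple_ok = True
except TypeError:
    tuple_ok = False
```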

In [23]:
# create a DataFrame from a NumPy array, with the DatetimeIndex as row labels & a custom list of column names
col_names = ['a','b','c','d']
df = pd.DataFrame(x_rand, index=d, columns=col_names)
df

| | a | b | c | d |
|---|---|---|---|---|
| 2022-04-02 | -0.868137 | -0.453021 | -0.905811 | -0.160057 |
| 2022-04-03 | 0.602059 | 0.495424 | 0.326603 | -0.588965 |
| 2022-04-04 | 0.602945 | -1.377735 | -0.456250 | -1.491116 |
| 2022-04-05 | 0.525913 | 0.863541 | -0.832834 | 0.821586 |
| 2022-04-06 | 1.219336 | 0.513207 | -1.156638 | 0.820773 |
| 2022-04-07 | 0.785408 | -0.324253 | 0.912719 | -0.734290 |
| 2022-04-08 | -0.591104 | 1.306925 | -0.418318 | 1.284483 |
| 2022-04-09 | 1.016965 | -1.312069 | -1.441226 | -1.302320 |
| 2022-04-10 | -0.578476 | -0.192253 | -1.303468 | 2.503513 |
| 2022-04-11 | -0.079843 | -0.072691 | 1.047024 | 0.166535 |


In [24]:
# create a DataFrame from a Python dictionary
dict_data = {
    'A': [1, 2, 3, 4],
    'B': pd.Timestamp('20220402'),  # a single date, repeated on every row of the DataFrame
}
df2 = pd.DataFrame(dict_data)
df2

# [NB]: A single pandas Timestamp given as a column value is broadcast, so the date "2022-04-02" is repeated on every row of the DataFrame.
# pd.Timestamp => the pandas replacement for Python's datetime.datetime object.

| | A | B |
|---|---|---|
| 0 | 1 | 2022-04-02 |
| 1 | 2 | 2022-04-02 |
| 2 | 3 | 2022-04-02 |
| 3 | 4 | 2022-04-02 |
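
The repetition of the scalar date can be verified with a short sketch. It rebuilds the same two-column frame, and `all_same` is an illustrative name.

```python
import pandas as pd

df2 = pd.DataFrame({
    'A': [1, 2, 3, 4],                  # the list sets the number of rows (4)
    'B': pd.Timestamp('20220402'),      # scalar value, repeated on every row
})

# True when every row of column B holds the same date
all_same = bool((df2['B'] == pd.Timestamp('2022-04-02')).all())
```

Only list-like columns determine the number of rows; a scalar such as the Timestamp here simply follows it.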


### Timestamp of YT Video: 11:41