In [21]:
import pandas as pd

## Data structures by dimension

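1. 1D - Series (like a labeled array)
2. 2D - DataFrame (like a matrix, SQL table, etc.)
3. 3D - Panel (from "panel data", the origin of the name pandas; removed in pandas 0.25)
4. 4D+ - Panel4D (also removed from later pandas releases)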

### Series

In [22]:
s1 = pd.Series([10,20,30,40])
s1

0    10
1    20
2    30
3    40
dtype: int64

> As the output above shows, pandas assigns a default integer index (0, 1, 2, ...). We can also supply custom index labels, as follows:

In [23]:
s2 = pd.Series([10,20,30,40,50], index=['Ten', 'Twen', 'Thirty', 'Forty', 'Fifty'])
s2

Ten       10
Twen      20
Thirty    30
Forty     40
Fifty     50
dtype: int64

In [24]:
'Ten' in s2

True

In [25]:
'Hundred' in s2

False
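Note that `in` on a Series tests the index labels, not the stored values. The sketch below illustrates this; the variable `s` simply repeats `s2` from above so that the block runs on its own, and the other names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50],
              index=['Ten', 'Twen', 'Thirty', 'Forty', 'Fifty'])

label_found = 'Ten' in s          # True: 'Ten' is an index label
value_as_label = 10 in s          # False: 10 is a value, not a label
value_found = 10 in s.values      # True: search the underlying values instead
has_value = s.isin([10]).any()    # True: the same check using a Series method
```

To test for a value rather than a label, search `s.values` or use `isin`.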

#### Convert a pandas series to python native dictionary

In [26]:
names = ['rahul', 'sachin', 'nishu', 'parshv']
ages = [20, 25, 25, 2]

s3 = pd.Series(ages, index=names)
print(s3)

print(s3['sachin'])

print(s3.to_dict())

rahul     20
sachin    25
nishu     25
parshv     2
dtype: int64
25
{'rahul': 20, 'sachin': 25, 'nishu': 25, 'parshv': 2}


#### We can also build a pandas Series from a Python dictionary

In [27]:
s4_from_dict = pd.Series(s3.to_dict())
s4_from_dict

rahul     20
sachin    25
nishu     25
parshv     2
dtype: int64
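When a dictionary is passed together with an explicit `index`, pandas looks each label up in the dictionary. A small sketch, assuming a hypothetical label `'rekha'` that is not a key in the dictionary:

```python
import pandas as pd

age_by_name = {'rahul': 20, 'sachin': 25, 'nishu': 25, 'parshv': 2}

# 'rekha' is not a key in the dict, so its value becomes NaN;
# 'nishu' and 'parshv' are not in the requested index, so they are dropped.
selected = pd.Series(age_by_name, index=['sachin', 'rahul', 'rekha'])
print(selected)
```

Because NaN is a floating-point value, the resulting Series has dtype `float64` rather than `int64`.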

### Question - What is the difference between a dictionary and a Series?

> A dictionary stores key-value pairs, like an object in JavaScript.
> A pandas Series is not a wrapper over a dictionary: it stores its values in a NumPy array (or a pandas extension array) of a single dtype, alongside an Index of labels.
> A Series can be built from array-like objects, scalars, dicts, etc.
> Unlike a dictionary, a Series allows duplicate labels and supports positional access and vectorized arithmetic.
> A Series has many more utility methods than a dictionary,
> e.g. isnull, max, median, multiply etc.
> https://stackoverflow.com/questions/43635694/difference-between-dictionary-and-pandas-series-in-python
> http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html
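A few of these differences can be seen side by side. This is only an illustrative sketch; the names `ages_dict` and `ages` are made up for the example.

```python
import numpy as np
import pandas as pd

ages_dict = {'rahul': 20, 'sachin': 25, 'nishu': 25, 'parshv': 2}
ages = pd.Series(ages_dict)

# A Series keeps its values in an array, separate from the index labels
backing_is_array = isinstance(ages.values, np.ndarray)

# Vectorized arithmetic: one expression instead of a loop over the dict
doubled_series = ages * 2
doubled_dict = {name: age * 2 for name, age in ages_dict.items()}

# Positional access and aggregation are built in
first_age = ages.iloc[0]
median_age = ages.median()
```

With a plain dictionary, the doubling needs an explicit comprehension, and the median would need a helper such as the `statistics` module.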

In [32]:
pd.isnull(s4_from_dict)

rahul     False
sachin    False
nishu     False
parshv    False
dtype: bool

#### Addition of two series
1. Values that share the same index label are added
2. Labels present in only one of the two series appear in the result with NaN as the value

In [29]:
s1 + s2

0        NaN
1        NaN
2        NaN
3        NaN
Ten      NaN
Twen     NaN
Thirty   NaN
Forty    NaN
Fifty    NaN
dtype: float64

In [31]:
s3 + s4_from_dict

rahul     40
sachin    50
nishu     50
parshv     4
dtype: int64
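The two results above are the extreme cases: `s1` and `s2` share no labels, so every entry is NaN, while `s3` and `s4_from_dict` share all of them. A sketch with partially overlapping labels (the series `left` and `right` are made up for illustration) shows both rules at once, and how `Series.add` with `fill_value` avoids the NaN:

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Plain addition: labels found in only one Series ('a', 'd') produce NaN
plain_sum = left + right

# Series.add with fill_value=0 treats a missing label as 0 instead
filled_sum = left.add(right, fill_value=0)

print(plain_sum)
print(filled_sum)
```

Both results have dtype `float64`, because the alignment step introduces missing values before the addition.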

#### We can attach a name to series and check the indexes and values separately

In [33]:
s4_from_dict.name = "Series made from dictionary"
print(s4_from_dict)
print(s4_from_dict.index)
print(s4_from_dict.values)

rahul     20
sachin    25
nishu     25
parshv     2
Name: Series made from dictionary, dtype: int64
Index(['rahul', 'sachin', 'nishu', 'parshv'], dtype='object')
[20 25 25  2]


#### Get subset of series

In [35]:
s4_from_dict[['sachin', 'rahul']]

sachin    25
rahul     20
Name: Series made from dictionary, dtype: int64
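The same selection can be written more explicitly with `.loc` (by label) and `.iloc` (by position). A minimal sketch, using a fresh series `ages` so the block is self-contained:

```python
import pandas as pd

ages = pd.Series([20, 25, 25, 2], index=['rahul', 'sachin', 'nishu', 'parshv'])

by_label = ages.loc[['sachin', 'rahul']]   # select by label, in the order requested
by_position = ages.iloc[[1, 0]]            # same rows, selected by integer position
label_slice = ages.loc['sachin':'nishu']   # label slices include both endpoints
position_slice = ages.iloc[1:2]            # positional slices exclude the end
```

Being explicit about label versus position avoids ambiguity when the index itself contains integers.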

In [37]:
s5 = s4_from_dict * 2
s5

rahul     40
sachin    50
nishu     50
parshv     4
Name: Series made from dictionary, dtype: int64

#### More filtering examples

In [39]:
s5[s5 > 20]

rahul     40
sachin    50
nishu     50
Name: Series made from dictionary, dtype: int64
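Boolean filters can be combined with `&` (and), `|` (or) and `~` (not). A short sketch, with `ages` holding the same values as `s5` at this point:

```python
import pandas as pd

ages = pd.Series([40, 50, 50, 4], index=['rahul', 'sachin', 'nishu', 'parshv'])

# Each comparison needs its own parentheses because & and | bind tighter than > and <
between = ages[(ages > 20) & (ages < 50)]
extremes = ages[(ages < 10) | (ages >= 50)]
not_fifty = ages[~(ages == 50)]
```

Python's `and`, `or` and `not` do not work element-wise on a Series, which is why the operator forms are used here.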

#### Adding one more row

In [42]:
s5['rekha'] = 40

In [43]:
s5

rahul     40
sachin    50
nishu     50
parshv     4
rekha     40
Name: Series made from dictionary, dtype: int64

#### Changing values based on condition

Subtract 5 from every age that is >= 50

In [45]:
# Assignment aligns on the index, so only the rows selected by the mask are updated
s5[s5 >= 50] = s5 - 5
s5

rahul     40
sachin    45
nishu     45
parshv     4
rekha     40
Name: Series made from dictionary, dtype: int64
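If the original series should stay untouched, `Series.mask` and `Series.where` are possible alternatives that return a new Series instead of assigning in place. A sketch, with `ages` holding the values `s5` had before the change:

```python
import pandas as pd

ages = pd.Series([40, 50, 50, 4, 40],
                 index=['rahul', 'sachin', 'nishu', 'parshv', 'rekha'])

# mask() replaces values where the condition is True and keeps the rest
adjusted = ages.mask(ages >= 50, ages - 5)

# where() is the inverse: it keeps values where the condition is True
same_result = ages.where(ages < 50, ages - 5)
```

Here `ages` itself is unchanged, which can make a chain of transformations easier to follow.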

In [47]:
s5[s5 >= 20]

rahul     40
sachin    45
nishu     45
rekha     40
Name: Series made from dictionary, dtype: int64