# Pandas Series
A Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index.

Make sure pandas is installed on your machine.

To install it, run one of the following (`py` is the Python launcher on Windows):
```
py -m pip install pandas
```
or
```
pip install pandas
```

In [3]:
import pandas as pd

In [36]:
numbers = [1, 2, 3, 4, 5]

# Create a Series
pd.Series(numbers) # In Jupyter, press Shift + Tab to open the documentation

0    1
1    2
2    3
3    4
4    5
dtype: int64

In [37]:
letters = ['a', 'b', 'c', 'd', 'e']

# Create a Series of strings; here they are stored with dtype object
pd.Series(letters)

0    a
1    b
2    c
3    d
4    e
dtype: object

In [38]:
pd.Series(data=letters, index=numbers) # The index can use any (hashable) labels, here the numbers 1 to 5

1    a
2    b
3    c
4    d
5    e
dtype: object

In [12]:
s = pd.Series(data=letters, index=['letter1', 'letter2', 'letter3', 'letter4', 'letter5'])

In [13]:
s['letter2']

'b'
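The lookup above uses the index label. As a minimal sketch, the same Series can also be read by position; the names `labels`, `by_label`, `by_position`, and `subset` are illustrative and not part of the original notebook.

```python
import pandas as pd

letters = ['a', 'b', 'c', 'd', 'e']
labels = ['letter1', 'letter2', 'letter3', 'letter4', 'letter5']
s = pd.Series(data=letters, index=labels)

by_label = s.loc['letter2']             # label-based lookup, same result as s['letter2']
by_position = s.iloc[1]                 # position-based lookup (positions start at 0)
subset = s.loc[['letter1', 'letter5']]  # a list of labels returns a new Series

print(by_label, by_position)
print(subset.tolist())
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the index labels are themselves integers.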

In [19]:
data = {
    'name':'Arun',
    'age':22,
    'JobRole': "Backend Developer"
}

# Create a Series from a dictionary: the keys become the index (mixed values give dtype object)
pd.Series(data)

name                    Arun
age                       22
JobRole    Backend Developer
dtype: object

In [22]:
# With an explicit index, values are looked up by key; a label that is not in the dictionary yields NaN
pd.Series(data=data, index=['name', 'age', 'Job'])

name    Arun
age       22
Job      NaN
dtype: object
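To make the key matching concrete, here is a small sketch using the same dictionary. It checks which labels came back missing and shows that a dictionary key not listed in `index` is left out of the result. The variable names `selected`, `missing`, `has_jobrole`, and `cleaned` are assumptions for illustration.

```python
import pandas as pd

data = {'name': 'Arun', 'age': 22, 'JobRole': 'Backend Developer'}
selected = pd.Series(data=data, index=['name', 'age', 'Job'])

missing = selected.isna()                  # True where the label had no matching key
has_jobrole = 'JobRole' in selected.index  # the unrequested key is not carried over
cleaned = selected.dropna()                # drop the labels that ended up as NaN

print(missing.tolist())
print(has_jobrole)
print(list(cleaned.index))
```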

Attributes and methods: an attribute gives you data about the object, while a method performs an operation on the object, which may end up changing it.

+ .dtype - Returns the data type of the values in the Series
+ .values - Returns the values of the Series as an array
+ .index - Returns information about the index
+ .hasnans - Checks whether the Series has any missing values (NaNs)
+ .shape - Returns the shape of the Series as a tuple, e.g. (5,) for 5 rows; usually used with DataFrames
+ .size - Returns the size of the Series (its number of elements)

## Attributes

In [23]:
numbers = [1, 2, 3, 4, 5]
letters = ['a', 'b', 'c', 'd', 'e']

# Create a series of numbers
number_series = pd.Series(numbers)
number_series

0    1
1    2
2    3
3    4
4    5
dtype: int64

In [24]:
# Create a series of letters
letter_series = pd.Series(letters)
letter_series

0    a
1    b
2    c
3    d
4    e
dtype: object

In [25]:
number_series.dtype

dtype('int64')

In [27]:
letter_series.dtype # Returns 'O' which represents object

dtype('O')

In [29]:
number_series.index

RangeIndex(start=0, stop=5, step=1)

In [30]:
letter_series.values

array(['a', 'b', 'c', 'd', 'e'], dtype=object)

In [31]:
# Create a Series with a missing value (None is treated as missing)
incomplete_series = pd.Series(data=['a', 1, None], index=[1, 2, 3])
incomplete_series

1       a
2       1
3    None
dtype: object

In [32]:
incomplete_series.hasnans

True

In [33]:
number_series.shape

(5,)

In [35]:
incomplete_series.size

3
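Note that `.size` counts every element, including missing ones. A minimal sketch of how the missing values could be counted separately, assuming the same `incomplete_series` as above (the names `total`, `present`, and `n_missing` are illustrative):

```python
import pandas as pd

incomplete_series = pd.Series(data=['a', 1, None], index=[1, 2, 3])

total = incomplete_series.size              # all elements, missing or not
present = incomplete_series.count()         # only the non-missing elements
n_missing = incomplete_series.isna().sum()  # number of missing elements

print(total, present, n_missing)
```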

## Methods

+ .max() - Returns the max of the values
+ .min() - Returns the min of the values
+ .sum() - Returns the sum of the values
+ .mean() - Returns the mean of the values
+ .median() - Returns the median of the values
+ .mode() - Returns the mode(s) of the values as a Series
+ .idxmax() - Returns the index label of the max value
+ .idxmin() - Returns the index label of the min value
+ .isnull() - Returns a boolean Series indicating whether each value is null
+ .round() - Rounds each value to the nearest integer by default, or to a given number of decimal places

In [52]:
new_series = pd.Series(data=[1.0, 2.2, 3.3, None, 4.6111, 8.2464, 1.0, 2.2, 4.1, 2.2])

In [53]:
new_series.max()

8.2464

In [54]:
new_series.min()

1.0

In [55]:
new_series.sum()

28.857499999999998

In [56]:
new_series.mean()

3.2063888888888887
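The mean above is computed over the nine non-missing values, because pandas reductions skip NaN by default. The sketch below illustrates this with the `skipna` parameter; the variable names are illustrative.

```python
import pandas as pd

new_series = pd.Series(data=[1.0, 2.2, 3.3, None, 4.6111, 8.2464, 1.0, 2.2, 4.1, 2.2])

default_mean = new_series.mean()              # NaN is ignored (skipna=True)
strict_mean = new_series.mean(skipna=False)   # NaN propagates into the result
manual_mean = new_series.sum() / new_series.count()  # sum over the non-missing count

print(default_mean, strict_mean, manual_mean)
```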

In [57]:
new_series.median()

2.2

In [58]:
new_series.mode()

0    2.2
dtype: float64
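`.mode()` returns a Series rather than a single number because several values can be tied for most frequent. A small sketch with made-up data (the Series `tied` is hypothetical, not from the notebook):

```python
import pandas as pd

tied = pd.Series([1, 1, 2, 2, 3])   # 1 and 2 both appear twice
modes = tied.mode()                 # all tied values, sorted

single = pd.Series([1.0, 2.2, 2.2]).mode().iloc[0]  # take the first mode as a scalar

print(modes.tolist())
print(single)
```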

In [59]:
new_series.idxmax()

5

In [60]:
new_series.idxmin()

0

In [50]:
new_series.isnull()

0    False
1    False
2    False
3     True
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool

In [61]:
new_series.round()

0    1.0
1    2.0
2    3.0
3    NaN
4    5.0
5    8.0
6    1.0
7    2.0
8    4.0
9    2.0
dtype: float64

In [63]:
# Round to three decimal places
new_series.round(3)

0    1.000
1    2.200
2    3.300
3      NaN
4    4.611
5    8.246
6    1.000
7    2.200
8    4.100
9    2.200
dtype: float64
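One detail worth knowing about `.round()`: values exactly halfway between two integers are rounded to the nearest even integer, not always upward. A minimal sketch with illustrative data (the Series `halves` is an assumption for demonstration):

```python
import pandas as pd

halves = pd.Series([0.5, 1.5, 2.5, 3.5])
rounded = halves.round()   # halfway cases go to the nearest even integer

print(rounded.tolist())
```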

## Handle CSV Files

In [68]:
# Read a single column from a CSV file and squeeze it into a Series
# .squeeze() converts a one-column DataFrame into a Series
richest_persons = pd.read_csv('TopRichestInWorld.csv', usecols=['Name']).squeeze()
richest_persons

0                     Elon Musk
1                    Jeff Bezos
2      Bernard Arnault & family
3                    Bill Gates
4                Warren Buffett
                 ...           
96             Vladimir Potanin
97         Harold Hamm & family
98                 Sun Piaoyang
99           Luo Liguo & family
100                   Peter Woo
Name: Name, Length: 101, dtype: object

In [69]:
# Check the type of richest_persons
type(richest_persons)

pandas.core.series.Series

In [71]:
# Write the Series to a CSV file (it is created in the current working directory)
richest_persons.to_csv('test.csv', index=False) # index=False leaves the index out of the file

In [72]:
pd.read_csv('test.csv')

|  | Name |
| --- | --- |
| 0 | Elon Musk |
| 1 | Jeff Bezos |
| 2 | Bernard Arnault & family |
| 3 | Bill Gates |
| 4 | Warren Buffett |
| ... | ... |
| 96 | Vladimir Potanin |
| 97 | Harold Hamm & family |
| 98 | Sun Piaoyang |
| 99 | Luo Liguo & family |


In [81]:
richest = pd.read_csv('test.csv').squeeze()

In [82]:
richest

0                     Elon Musk
1                    Jeff Bezos
2      Bernard Arnault & family
3                    Bill Gates
4                Warren Buffett
                 ...           
96             Vladimir Potanin
97         Harold Hamm & family
98                 Sun Piaoyang
99           Luo Liguo & family
100                   Peter Woo
Name: Name, Length: 101, dtype: object
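The effect of `index=False` can be checked without touching the file system. The sketch below is an illustration only: it uses a short made-up Series and an in-memory buffer instead of `TopRichestInWorld.csv`, and relies on `to_csv()` returning the CSV text when no path is given.

```python
import io
import pandas as pd

names = pd.Series(['Elon Musk', 'Jeff Bezos', 'Bill Gates'], name='Name')

with_index = names.to_csv()                # CSV text including the index column
without_index = names.to_csv(index=False)  # CSV text with the values only

# Reading the version with the index back produces an extra 'Unnamed: 0' column
extra_columns = pd.read_csv(io.StringIO(with_index)).columns.tolist()

# Reading the version without the index gives one column, which squeezes to a Series
restored = pd.read_csv(io.StringIO(without_index)).squeeze("columns")

print(extra_columns)
print(type(restored).__name__, restored.tolist())
```

Passing `"columns"` to `.squeeze()` restricts the squeeze to the column axis, so a file with a single row still comes back as a Series rather than a scalar.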

+ .head() - Returns the first 5 rows by default, or the first n rows with .head(n)
+ .tail() - Returns the last 5 rows by default, or the last n rows with .tail(n)

In [83]:
dataframe = pd.read_csv('TopRichestInWorld.csv')

In [84]:
dataframe.head() # Returns the top 5 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Elon Musk | $219,000,000,000 | 50 | United States | Tesla, SpaceX | Automotive |
| 1 | Jeff Bezos | $171,000,000,000 | 58 | United States | Amazon | Technology |
| 2 | Bernard Arnault & family | $158,000,000,000 | 73 | France | LVMH | Fashion & Retail |
| 3 | Bill Gates | $129,000,000,000 | 66 | United States | Microsoft | Technology |
| 4 | Warren Buffett | $118,000,000,000 | 91 | United States | Berkshire Hathaway | Finance & Investments |


In [85]:
dataframe.head(10) # Returns the top 10 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Elon Musk | $219,000,000,000 | 50 | United States | Tesla, SpaceX | Automotive |
| 1 | Jeff Bezos | $171,000,000,000 | 58 | United States | Amazon | Technology |
| 2 | Bernard Arnault & family | $158,000,000,000 | 73 | France | LVMH | Fashion & Retail |
| 3 | Bill Gates | $129,000,000,000 | 66 | United States | Microsoft | Technology |
| 4 | Warren Buffett | $118,000,000,000 | 91 | United States | Berkshire Hathaway | Finance & Investments |
| 5 | Larry Page | $111,000,000,000 | 49 | United States | Google | Technology |
| 6 | Sergey Brin | $107,000,000,000 | 48 | United States | Google | Technology |
| 7 | Larry Ellison | $106,000,000,000 | 77 | United States | software | Technology |
| 8 | Steve Ballmer | $91,400,000,000 | 66 | United States | Microsoft | Technology |
| 9 | Mukesh Ambani | $90,700,000,000 | 64 | India | diversified | Diversified |


In [87]:
dataframe.head(-10) # Returns all rows, excluding last 10 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Elon Musk | $219,000,000,000 | 50 | United States | Tesla, SpaceX | Automotive |
| 1 | Jeff Bezos | $171,000,000,000 | 58 | United States | Amazon | Technology |
| 2 | Bernard Arnault & family | $158,000,000,000 | 73 | France | LVMH | Fashion & Retail |
| 3 | Bill Gates | $129,000,000,000 | 66 | United States | Microsoft | Technology |
| 4 | Warren Buffett | $118,000,000,000 | 91 | United States | Berkshire Hathaway | Finance & Investments |
| ... | ... | ... | ... | ... | ... | ... |
| 86 | Vladimir Lisin | $18,400,000,000 | 65 | Russia | steel, transport | Metals & Mining |
| 87 | Fan Hongwei & family | $18,200,000,000 | 55 | China | petrochemicals | Energy |
| 88 | Lakshmi Mittal | $17,900,000,000 | 71 | India | steel | Metals & Mining |
| 89 | Andrew Forrest | $17,800,000,000 | 60 | Australia | mining | Metals & Mining |


In [88]:
dataframe.tail() # Returns the last 5 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 96 | Vladimir Potanin | $17,300,000,000 | 61 | Russia | metals | Metals & Mining |
| 97 | Harold Hamm & family | $17,200,000,000 | 76 | United States | oil & gas | Energy |
| 98 | Sun Piaoyang | $17,100,000,000 | 63 | China | pharmaceuticals | Healthcare |
| 99 | Luo Liguo & family | $17,000,000,000 | 66 | China | chemicals | Manufacturing |
| 100 | Peter Woo | $17,000,000,000 | 75 | Hong Kong | real estate | Real Estate |


In [90]:
dataframe.tail(10) # Returns the last 10 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 91 | Savitri Jindal & family | $17,700,000,000 | 72 | India | steel | Metals & Mining |
| 92 | Wang Wenyin | $17,700,000,000 | 54 | China | mining, copper products | Metals & Mining |
| 93 | Li Xiting | $17,600,000,000 | 71 | Singapore | medical devices | Healthcare |
| 94 | Stefan Persson | $17,600,000,000 | 74 | Sweden | H&M | Fashion & Retail |
| 95 | Steve Cohen | $17,400,000,000 | 65 | United States | hedge funds | Finance & Investments |
| 96 | Vladimir Potanin | $17,300,000,000 | 61 | Russia | metals | Metals & Mining |
| 97 | Harold Hamm & family | $17,200,000,000 | 76 | United States | oil & gas | Energy |
| 98 | Sun Piaoyang | $17,100,000,000 | 63 | China | pharmaceuticals | Healthcare |
| 99 | Luo Liguo & family | $17,000,000,000 | 66 | China | chemicals | Manufacturing |
| 100 | Peter Woo | $17,000,000,000 | 75 | Hong Kong | real estate | Real Estate |


In [92]:
dataframe.tail(-10) # Returns all rows excluding first 10 rows

|  | Name | NetWorth | Age | Country/Territory | Source | Industry |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | Gautam Adani & family | $90,000,000,000 | 59 | India | infrastructure, commodities | Diversified |
| 11 | Michael Bloomberg | $82,000,000,000 | 80 | United States | Bloomberg LP | Media & Entertainment |
| 12 | Carlos Slim Helu & family | $81,200,000,000 | 82 | Mexico | telecom | Telecom |
| 13 | Francoise Bettencourt Meyers & family | $74,800,000,000 | 68 | France | L'Oréal | Fashion & Retail |
| 14 | Mark Zuckerberg | $67,300,000,000 | 37 | United States | Facebook | Technology |
| ... | ... | ... | ... | ... | ... | ... |
| 96 | Vladimir Potanin | $17,300,000,000 | 61 | Russia | metals | Metals & Mining |
| 97 | Harold Hamm & family | $17,200,000,000 | 76 | United States | oil & gas | Energy |
| 98 | Sun Piaoyang | $17,100,000,000 | 63 | China | pharmaceuticals | Healthcare |
| 99 | Luo Liguo & family | $17,000,000,000 | 66 | China | chemicals | Manufacturing |
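
The positive and negative arguments to `.head()` and `.tail()` are easier to compare on a small object. The following sketch uses a hypothetical ten-element Series instead of the CSV data; the same behaviour applies to DataFrames.

```python
import pandas as pd

s = pd.Series(range(10))   # values 0 through 9

first_three = s.head(3)        # first 3 rows
all_but_last_three = s.head(-3)   # everything except the last 3 rows
last_three = s.tail(3)         # last 3 rows
all_but_first_three = s.tail(-3)  # everything except the first 3 rows

print(first_three.tolist())
print(all_but_last_three.tolist())
print(last_three.tolist())
print(all_but_first_three.tolist())
```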
