It contains data structures and data manipulation tools designed to make data cleaning and analysis fast and easy in Python.

In [2]:
import pandas as pd
from pandas import Series, DataFrame

**Pandas Series**

In [8]:
data = [1,2,3,4,5]
index = ['a', 'b', 'c', 'd', 'e']
s = pd.Series(data, index=index)
print(s)
print(s.values)
print(s.index)
print(s.iloc[1])  # access by position; s[1] on a label index is deprecated in recent pandas

a    1
b    2
c    3
d    4
e    5
dtype: int64
[1 2 3 4 5]
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
2


**Creating a Series from a Dictionary**

In [13]:
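# Use a descriptive name; calling the variable `dict` would shadow the built-in type
fruit_counts = {'apple'  : 5,
                'peach'  : 9,
                'banana' : 10,
                'mango'  : 1}
s1 = pd.Series(fruit_counts)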
# keys of the dictionary are used as the index labels
# values of the dictionary are used as the values in the Series
print(s1)
     

apple      5
peach      9
banana    10
mango      1
dtype: int64


**Changing Series index**

In [16]:
s = pd.Series([1,2,3,4,5], index=['a','b','c','d','e'])

print(s)

# Alter the series index
s.index = [10,20,30,40,50]

# Print the series
print("\nSeries  with modified Index: ")
print(s)


a    1
b    2
c    3
d    4
e    5
dtype: int64

Series  with modified Index: 
10    1
20    2
30    3
40    4
50    5
dtype: int64


**Operations with Pandas Series**

In [17]:
s1 = pd.Series([5,4,3,2,1])
s2 = pd.Series([1,2,3,4,5])

print(s1 + s2)
print(s2 - s1)
print(s2 * s1)

0    6
1    6
2    6
3    6
4    6
dtype: int64
0   -4
1   -2
2    0
3    2
4    4
dtype: int64
0    5
1    8
2    9
3    8
4    5
dtype: int64
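
The two Series above share the same default index, so their values pair up one to one. pandas matches values by index label rather than by position, which matters once the labels differ. The following is a minimal sketch with made-up labels and values (the names `left` and `right` are illustrative, not from the notebook):

```python
import pandas as pd

# Two Series whose labels only partly overlap (illustrative data)
left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Addition matches values by label; labels present in only one Series give NaN
total = left + right
print(total)

# add() with fill_value treats a missing label as 0 instead of producing NaN
total_filled = left.add(right, fill_value=0)
print(total_filled)
```

Because NaN is a floating-point value, both results come back with a float dtype even though the inputs were integers.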


**Accessing Elements**

In [20]:
s = pd.Series([1,2,3,4,5], index=['a','b','c','d','e'])

# access element by label
print(s.loc['a'])  

# access element by position
print(s.iloc[0])  

1
1


**Slicing a Pandas Series**

In [22]:
# slice the Series using index labels
print(s['b':'d'])

# slice the Series using positions
print(s[2:])

b    2
c    3
d    4
dtype: int64
c    3
d    4
e    5
dtype: int64
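
Note that the two kinds of slice treat their end point differently: a label slice includes the end label, while a positional slice excludes the end position, as with Python lists. A short sketch using the same Series, written with `.loc` and `.iloc` to make the intent explicit:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

by_label = s.loc['b':'d']    # label slice: the end label 'd' is included
by_position = s.iloc[1:3]    # positional slice: position 3 is excluded

print(by_label)
print(by_position)
```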


# DataFrame in Pandas

A DataFrame represents a rectangular table of data and contains an ordered collection
of columns, each of which can be a different value type (numeric, string,
boolean, etc.). The DataFrame has both a row and column index; it can be thought of
as a dict of Series all sharing the same index.
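
To make the "dict of Series" picture concrete, here is a small sketch that builds a DataFrame from two Series sharing one index. The labels `'x'`, `'y'`, `'z'` and the variable names are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical columns: two Series sharing the row labels 'x', 'y', 'z'
ages = pd.Series([30, 25, 40], index=['x', 'y', 'z'])
cities = pd.Series(['New York', 'San Francisco', 'Chicago'], index=['x', 'y', 'z'])

# The dict keys become column names; the shared labels become the row index
frame = pd.DataFrame({'age': ages, 'city': cities})
print(frame)

# Selecting a column gives back a Series carrying the same row index
print(type(frame['age']))
print(frame['age'].index.equals(frame['city'].index))
```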

**Using Dictionaries**

In [23]:
import pandas as pd

# Use a descriptive name; calling the variable `dict` would shadow the built-in type
people = {'name': ['Ali', 'Josh', 'Becky'],
          'age': [30, 25, 40],
          'city': ['New York', 'San Francisco', 'Chicago']}

# Create a DataFrame from the dictionary
frame = pd.DataFrame(people)

# Print the DataFrame
print(frame)

    name  age           city
0    Ali   30       New York
1   Josh   25  San Francisco
2  Becky   40        Chicago


**Using a List of Dictionaries**

In [24]:
# Create a list of dictionaries, one per row
# (named `records` because `list` would shadow the built-in type)
records = [{'name' : 'adam', 'age' : 25, 'city' : 'San Francisco'},
           {'name' : 'adam', 'age' : 25, 'city' : 'Chicago'},
           {'name' : 'adam', 'age' : 25, 'city' : 'New York'}]

# Create a DataFrame from the list
frame = pd.DataFrame(records)

# Print the DataFrame
print(frame)

   name  age           city
0  adam   25  San Francisco
1  adam   25        Chicago
2  adam   25       New York


**Using CSV**

In [25]:
# Replace "File Path" with the path to a CSV file, then uncomment:
# frame = pd.read_csv("File Path")
# print(frame)
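
Since the cell above needs a real file to run, here is a self-contained sketch that feeds `pd.read_csv` an in-memory string instead. The CSV contents are made up for illustration; with a file on disk you would pass its path in place of the `io.StringIO` object.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk (made-up contents)
csv_text = """name,age,city
Ali,30,New York
Josh,25,San Francisco
Becky,40,Chicago
"""

# read_csv accepts a file path or any file-like object;
# the first line is used as the column names by default
frame = pd.read_csv(io.StringIO(csv_text))
print(frame)
print(frame.dtypes)
```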

# Accessing Data in a DataFrame

**Rows**

In [28]:
data = {'name': ['sin', 'jade', 'Romeo'],
        'age': [30, 25, 40],
        'city': ['New York', 'San Francisco', 'Chicago']}
df = pd.DataFrame(data)

# Access a row by its index label
print(df.loc[2])

name      Romeo
age          40
city    Chicago
Name: 2, dtype: object


**Columns**

In [30]:
df = pd.DataFrame(data)
# Access a column by name
print(df['name'])

0      sin
1     jade
2    Romeo
Name: name, dtype: object


In [31]:
# Access a cell by row and column
print(df.loc[1, 'name'])

jade


In [41]:
data = {'name': ['sin', 'jade', 'Romeo'],
        'age': [30, 25, 40],
        'city': ['New York', 'San Francisco', 'Chicago']}
df = pd.DataFrame(data)

# Select the rows where age is greater than 30
df[df['age'] > 30]

    name  age     city
2  Romeo   40  Chicago
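
The expression inside the brackets is a boolean Series, so several conditions can be combined before filtering. A minimal sketch on the same data follows; the thresholds and the names `older_than_26` and `not_chicago` are chosen only for illustration:

```python
import pandas as pd

data = {'name': ['sin', 'jade', 'Romeo'],
        'age': [30, 25, 40],
        'city': ['New York', 'San Francisco', 'Chicago']}
df = pd.DataFrame(data)

# Each comparison yields a boolean Series with one entry per row
older_than_26 = df['age'] > 26
not_chicago = df['city'] != 'Chicago'

# Combine masks with & (and), | (or), ~ (not); Python's `and`/`or` do not work here
result = df[older_than_26 & not_chicago]
print(result)
```

When the comparisons are written inline, each one needs its own parentheses, for example `df[(df['age'] > 26) & (df['city'] != 'Chicago')]`, because `&` binds more tightly than `>`.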


# Modifying a DataFrame

In [32]:
# Add a new column
df['state'] = ['NY', 'CA', 'IL']

# print a DataFrame
print(df)

    name  age           city state
0    sin   30       New York    NY
1   jade   25  San Francisco    CA
2  Romeo   40        Chicago    IL


In [33]:
# Update a column
df['name'] = ['Adam', 'Alice', 'Sophia']

# print a DataFrame
print(df)

     name  age           city state
0    Adam   30       New York    NY
1   Alice   25  San Francisco    CA
2  Sophia   40        Chicago    IL


In [34]:
# Remove a column
df = df.drop('age', axis=1)

# print the DataFrame
print(df)

     name           city state
0    Adam       New York    NY
1   Alice  San Francisco    CA
2  Sophia        Chicago    IL
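
The cell above reassigns the result to `df` because `drop` returns a new DataFrame and leaves the original untouched by default. A small sketch with a made-up two-column frame (the names `original` and `trimmed` are illustrative) shows the difference:

```python
import pandas as pd

original = pd.DataFrame({'name': ['Adam', 'Alice'], 'age': [30, 25]})

# drop() returns a copy without the column; columns='age' is equivalent to ('age', axis=1)
trimmed = original.drop(columns='age')

print(list(original.columns))  # the original still has both columns
print(list(trimmed.columns))   # only the copy lost 'age'
```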


In [42]:
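# Sorting
# The 'age' column was dropped from df above, so rebuild the DataFrame from `data` first
df = pd.DataFrame(data)
df_sorted = df.sort_values('age', ascending=False)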

# Print the sorted DataFrame
print(df_sorted)

    name  age           city
2  Romeo   40        Chicago
0    sin   30       New York
1   jade   25  San Francisco
