# Reference Document
This IPython notebook is a reference for many of the pandas methods that we have used in greater detail.

In [1]:
import pandas as pd

data = [['Tom', 10], ['Tom', 15], ['JUlI', 14], ['alBlcld', 20]]  # each inner list is one [name, age] row
df = pd.DataFrame(data, columns=['Name', 'Age'])  # build a DataFrame from the list of rows

df  # display the DataFrame

|   | Name | Age |
|---|------|-----|
| 0 | Tom | 10 |
| 1 | Tom | 15 |
| 2 | JUlI | 14 |
| 3 | alBlcld | 20 |


In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    4 non-null      object
 1   Age     4 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 192.0+ bytes


### apply()
- Applies a function along an axis of a DataFrame, or to each value of a Series
- On a DataFrame, the function receives each column as a Series (axis=0, the default) or each row as a Series (axis=1)

In [3]:
df['Age'] = df['Age'].apply(lambda x: x + 5)  # apply the lambda to each value in the 'Age' column
df  # display the DataFrame

|   | Name | Age |
|---|------|-----|
| 0 | Tom | 15 |
| 1 | Tom | 20 |
| 2 | JUlI | 19 |
| 3 | alBlcld | 25 |
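
The cell above calls apply() on a single column, so the lambda sees one value at a time. As a minimal sketch of how the axis argument behaves on a whole DataFrame, the example below uses a small hypothetical frame (`scores`, with made-up columns `a` and `b` that are not part of this notebook's data).

```python
import pandas as pd

# Hypothetical numeric frame, used only to illustrate the axis argument
scores = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series
col_totals = scores.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series
row_totals = scores.apply(lambda row: row.sum(), axis=1)

print(col_totals)  # one total per column, indexed by column name
print(row_totals)  # one total per row, indexed by row label
```

For simple sums like these, `scores.sum(axis=...)` would be the more direct call; apply() is mainly useful when the function has no built-in equivalent.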


### split()
- The split function splits a string around a given separator/delimiter
- The resulting pieces are returned as a list
- Works only on strings; on a pandas Series it is called through the .str accessor

In [4]:
df['Name_list'] = df['Name'].str.split('l')  # split each name at every lowercase 'l' (the match is case-sensitive)
df

|   | Name | Age | Name_list |
|---|------|-----|-----------|
| 0 | Tom | 15 | [Tom] |
| 1 | Tom | 20 | [Tom] |
| 2 | JUlI | 19 | [JU, I] |
| 3 | alBlcld | 25 | [a, B, c, d] |


In [5]:
# Here we access only one specific piece of the split
# .str[1] takes the element at index 1 (the second piece); rows whose split has only one piece get NaN
df['Name_split_0'] = df['Name'].str.split('l').str[1]
df

|   | Name | Age | Name_list | Name_split_0 |
|---|------|-----|-----------|--------------|
| 0 | Tom | 15 | [Tom] | NaN |
| 1 | Tom | 20 | [Tom] | NaN |
| 2 | JUlI | 19 | [JU, I] | I |
| 3 | alBlcld | 25 | [a, B, c, d] | B |
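
Two optional arguments of `Series.str.split` are often useful alongside the cells above: `n` limits how many splits are made, and `expand=True` spreads the pieces across columns instead of storing a list in each cell. The sketch below rebuilds the names in a standalone Series (`names` is an illustrative variable, not one defined in this notebook).

```python
import pandas as pd

# Same names as in the notebook, rebuilt here so the example is self-contained
names = pd.Series(['Tom', 'JUlI', 'alBlcld'])

# n=1 stops after the first separator, so each list has at most two pieces
limited = names.str.split('l', n=1)

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values
expanded = names.str.split('l', expand=True)

print(limited)
print(expanded)
```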


### replace()
- Replaces each occurrence of a given value with a specified replacement
- The basic arguments are ('value', 'replacement value')

In [6]:
df['replace'] = df['Name'].str.replace('l' , 'W')
df

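|   | Name | Age | Name_list | Name_split_0 | replace |
|---|------|-----|-----------|--------------|---------|
| 0 | Tom | 15 | [Tom] | NaN | Tom |
| 1 | Tom | 20 | [Tom] | NaN | Tom |
| 2 | JUlI | 19 | [JU, I] | I | JUWI |
| 3 | alBlcld | 25 | [a, B, c, d] | B | aWBWcWd |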
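
The cell above uses `.str.replace`, which replaces substrings inside each string. pandas also has `Series.replace` (without `.str`), which by default matches whole cell values instead. A minimal sketch of the difference, using a standalone Series (`names` is an illustrative variable, not one defined in this notebook):

```python
import pandas as pd

names = pd.Series(['Tom', 'JUlI', 'alBlcld'])

# .str.replace works inside each string; regex=False treats 'l' as literal text
literal = names.str.replace('l', 'W', regex=False)

# Series.replace swaps whole values that match exactly
whole = names.replace('Tom', 'Thomas')

# No cell is exactly equal to 'l', so nothing changes here
unchanged = names.replace('l', 'W')

print(literal)
print(whole)
print(unchanged)
```

Passing `regex=False` explicitly avoids surprises when the pattern contains characters such as `.` or `$` that have a special meaning in regular expressions.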


### value_counts()
- Returns a Series containing the counts of unique values
- Sorted in descending order, so the most frequently occurring value comes first

In [7]:
df['Name'].value_counts()

Tom        2
JUlI       1
alBlcld    1
Name: Name, dtype: int64
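
As a small sketch of a common variation, `value_counts(normalize=True)` returns each value's share of the total instead of a raw count. The Series below (`names`) is rebuilt for illustration and is not a variable from this notebook.

```python
import pandas as pd

names = pd.Series(['Tom', 'Tom', 'JUlI', 'alBlcld'])

# Raw counts, most frequent value first
counts = names.value_counts()

# normalize=True returns relative frequencies that sum to 1
shares = names.value_counts(normalize=True)

print(counts)
print(shares)
```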

### lower()
- Converts the strings in a Series to lowercase

In [10]:
df['Name'] = df['Name'].str.lower()
df

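|   | Name | Age | Name_list | Name_split_0 | replace |
|---|------|-----|-----------|--------------|---------|
| 0 | tom | 15 | [Tom] | NaN | Tom |
| 1 | tom | 20 | [Tom] | NaN | Tom |
| 2 | juli | 19 | [JU, I] | I | JUWI |
| 3 | alblcld | 25 | [a, B, c, d] | B | aWBWcWd |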
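
One typical reason to lowercase a column is to make differently capitalized spellings count as the same value. The sketch below shows this with hypothetical data (the mixed-case `names` Series is made up for illustration and does not come from this notebook).

```python
import pandas as pd

# Hypothetical names with inconsistent capitalization
names = pd.Series(['Tom', 'tom', 'TOM', 'JUlI'])

# Without normalization, every spelling is counted as a separate value
raw_counts = names.value_counts()

# After lowercasing, the three spellings of 'tom' are counted together
normalized_counts = names.str.lower().value_counts()

print(raw_counts)
print(normalized_counts)
```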


### Documentation
- All of this information can be found in the [pandas documentation](https://pandas.pydata.org/docs/index.html)
- Other functions are documented there as well