### Using Pandas on a Real Dataset
**Pokemon Dataset** - https://www.kaggle.com/datasets/abcsds/pokemon

**Video Walkthrough of Pandas Operations** - https://www.youtube.com/watch?v=vmEHCJofslg
<br></br>
### Important Built-in Methods
`import pandas as pd`

1. **DataFrame Creation**
   1. pd.DataFrame(data): Create a DataFrame from data structures such as lists, dictionaries, or NumPy arrays.
   2. pd.read_csv(file_path): Read a CSV file and create a DataFrame.
   <br></br>
2. **Data Exploration and Manipulation**
   1. df.head(n): Get the first n rows of the DataFrame.
   2. df.tail(n): Get the last n rows of the DataFrame.
   3. df.info(): Display information about the DataFrame, including data types and column names.
   4. df.describe(): Generate descriptive statistics of the DataFrame.
   5. df.shape: Get the dimensions (rows, columns) of the DataFrame.
   6. df.columns: Get the column labels of the DataFrame.
   7. df.dtypes: Get the data types of each column.
   8. df.isnull(): Check for missing values in the DataFrame.
   9. df.dropna(): Remove rows or columns with missing values.
   10. df.groupby(by): Group the DataFrame by one or more columns.
   11. df.sort_values(by): Sort the DataFrame based on one or more columns.
   12. df.rename(columns): Rename columns of the DataFrame.
   13. df.merge(df2): Merge two DataFrames based on common columns.
<br></br>
3. **Data Selection and Filtering**
   1. df.index: Get the index (row labels) of the DataFrame.
   2. df[column_name]: Select a single column; pass a list, df[[col1, col2]], to select multiple columns.
   3. df.loc[row_indexer, column_indexer]: Access specific rows and columns using labels.
   4. df.iloc[row_indexer, column_indexer]: Access specific rows and columns using integer-based indexing.
   5. df.query(expression): Filter rows based on a Boolean expression.
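
The methods listed above can be tried on a small in-memory DataFrame before loading the real CSV. The following is a minimal sketch, assuming a hypothetical four-row sample whose column names mirror the Pokemon dataset; the `Weak To` lookup table is invented purely to illustrate `merge`.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the Pokemon dataset.
data = {
    "Name": ["Bulbasaur", "Charmander", "Squirtle", "Pikachu"],
    "Type 1": ["Grass", "Fire", "Water", "Electric"],
    "Type 2": ["Poison", None, None, None],
    "HP": [45, 39, 44, 35],
    "Attack": [49, 52, 48, 55],
}
df = pd.DataFrame(data)

# Exploration
print(df.shape)                           # (rows, columns)
print(df.dtypes)                          # data type of each column
missing_per_column = df.isnull().sum()    # count of missing values per column

# Selection and filtering
names = df["Name"]                        # single column -> Series
subset = df[["Name", "HP"]]               # list of columns -> DataFrame
first_name_by_label = df.loc[0, "Name"]   # label-based access
first_name_by_pos = df.iloc[0, 0]         # position-based access
strong = df.query("Attack > 50")          # Boolean expression over column names

# Merge on a shared column; the lookup table is illustrative only.
types = pd.DataFrame({"Type 1": ["Grass", "Fire"], "Weak To": ["Fire", "Water"]})
merged = df.merge(types, on="Type 1", how="left")
```

With `how="left"`, every row of `df` is kept and rows without a match in `types` get a missing value in `Weak To`.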


In [26]:
import pandas as pd

filePath = "C:/Users/NANDANGN/Desktop/Python Programming/Python_Programming/Pokemon.csv"

df = pd.read_csv(filePath)

# Understanding data head and tail
head = df.head()
tail = df.tail()
# print(head)
# print(tail)


### Reading Data In Pandas
1. Reading headers - df.columns
2. Reading a single row - df.iloc[row_number]
3. Reading a single column - df['Col Name']
4. Reading a specific location (R, C) - df.iloc[row, col]
5. Iterating over rows in a for loop - df.iterrows()
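
The difference between position-based and label-based access is easier to see when the index is not the default 0, 1, 2. The sketch below uses a hypothetical three-row frame with custom index labels (10, 20, 30), which is an assumption made only for illustration.

```python
import pandas as pd

# Hypothetical data with non-default index labels to contrast loc and iloc.
df = pd.DataFrame(
    {"Name": ["Bulbasaur", "Ivysaur", "Venusaur"], "HP": [45, 60, 80]},
    index=[10, 20, 30],
)

headers = list(df.columns)     # column labels
second_row = df.iloc[1]        # by position: the second row
row_label_20 = df.loc[20]      # by label: the row whose index label is 20
cell = df.iloc[2, 1]           # row position 2, column position 1

# iterrows() yields (index_label, row_as_Series) pairs.
names_seen = []
for label, row in df.iterrows():
    names_seen.append((label, row["Name"]))
```

`iterrows()` is convenient for inspection but tends to be slow on large frames, so vectorized column operations are usually preferable for computation.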

### Sorting and Describing Data
1. Describing data - df.describe()
2. Sorting data by a column - df.sort_values('Column', ascending=True) (pass ascending=False for descending order)
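
`sort_values` also accepts a list of columns with a matching list of sort directions. A minimal sketch, assuming a hypothetical four-row sample:

```python
import pandas as pd

# Hypothetical sample rows for illustration.
df = pd.DataFrame({
    "Name": ["Charmander", "Bulbasaur", "Squirtle", "Ivysaur"],
    "Type 1": ["Fire", "Grass", "Water", "Grass"],
    "HP": [39, 45, 44, 60],
})

# Single column, descending.
by_hp_desc = df.sort_values("HP", ascending=False)

# Type 1 ascending, then HP descending within each type.
by_type_then_hp = df.sort_values(["Type 1", "HP"], ascending=[True, False])

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
stats = df.describe()
```

Both calls return a new DataFrame and leave `df` unchanged.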

### Making Changes in Data
1. Creating a new column -> df['columnName'] = data
2. Dropping columns -> df.drop(columns=ColName) (returns a new DataFrame; assign the result back to keep the change)
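
When building a derived column, selecting the source columns by name is less fragile than slicing by position, because it does not depend on column order. A small sketch with hypothetical data and a reduced set of stat columns:

```python
import pandas as pd

# Hypothetical two-row sample with only three stat columns.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "HP": [45, 39],
    "Attack": [49, 52],
    "Defense": [49, 43],
})

# Sum the chosen columns row by row (axis=1) into a new column.
stat_cols = ["HP", "Attack", "Defense"]
df["Total"] = df[stat_cols].sum(axis=1)

# drop() returns a new DataFrame; df itself still has the Total column.
without_total = df.drop(columns=["Total"])
```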

### Filtering Data & Saving the Data
1. Example 1 -> df.loc[(df['Type 1']=='Grass') & (df['Total']>=500)]
2. Example 2 -> 
   1. new_df = df.loc[(df['Type 1'] =='Grass') & (df['Type 2'] =='Poison') & (df['Total']>=500)]
   2. new_df = new_df.reset_index(drop=True) # drop the existing index and build a new one
   3. new_df.to_csv('Strong_Polemon.csv')
3. Example 3 ->
   1. Checking for contains: keep rows whose value contains the given string -> df.loc[df['Name'].str.contains('Mega')]
   2. Checking for not contains (na=False treats missing values as "no match"): df.loc[(df['Type 1'].str.contains('Grass')) & ~(df['Type 2'].str.contains('Poison', na=False))]
   3. Using regex for more flexible filters (import re)
      1. df.loc[df['Name'].str.contains('pi[a-z]*',flags=re.I,regex=True)]
      2. df.loc[(df['Type 1'].str.contains('Grass')) & (df['Type 2'].str.contains('Poison|Fighting', flags=re.I, regex=True, na=False))]
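
The filters above can be checked on a small sample that includes missing `Type 2` values, which is where `str.contains` needs care. This is a sketch under assumptions: the four rows are hypothetical, the `^pi` pattern is a stricter variant that matches only names starting with "pi", and the CSV is written to an in-memory buffer rather than to disk.

```python
import io
import re

import pandas as pd

# Hypothetical sample; two rows have no secondary type.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "VenusaurMega Venusaur", "Pikachu", "Tangela"],
    "Type 1": ["Grass", "Grass", "Electric", "Grass"],
    "Type 2": ["Poison", "Poison", None, None],
    "Total": [318, 625, 320, 435],
})

# Combine conditions with & and wrap each comparison in parentheses.
strong_grass = df.loc[(df["Type 1"] == "Grass") & (df["Total"] >= 500)]
strong_grass = strong_grass.reset_index(drop=True)   # fresh 0..n-1 index

# na=False makes missing values count as "no match", so ~ keeps those rows.
grass_not_poison = df.loc[
    df["Type 1"].str.contains("Grass")
    & ~df["Type 2"].str.contains("Poison", na=False)
]

# Case-insensitive regex anchored to the start of the name.
starts_with_pi = df.loc[df["Name"].str.contains(r"^pi", flags=re.I, regex=True)]

# index=False leaves the row index out of the saved file.
buffer = io.StringIO()
strong_grass.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing a file path instead of `buffer` to `to_csv` writes the same content to disk.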

### Conditional Changes
1. Changing values that match a condition -> df.loc[df['Type 1']=='Fairy', 'Type 1'] = 'Fire'
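
The same `df.loc[condition, column] = value` pattern works when the condition is on one column and the assignment targets another. A minimal sketch with hypothetical rows; the threshold of 600 is chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "Name": ["Clefairy", "Charmander", "Mewtwo"],
    "Type 1": ["Fairy", "Fire", "Psychic"],
    "Total": [323, 309, 680],
    "Legendary": [False, False, False],
})

# Condition and target are the same column.
df.loc[df["Type 1"] == "Fairy", "Type 1"] = "Fire"

# Condition on one column, assignment to another.
df.loc[df["Total"] > 600, "Legendary"] = True
```

Doing the selection and the assignment in a single `loc` call modifies `df` directly, which avoids the ambiguity of chained indexing such as `df[mask]['col'] = value`.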

### Groupby (Sum, Mean, Count)
1. df.groupby(['Type 1']).mean(numeric_only=True).sort_values('Defense',ascending=False)
2. df.groupby(['Name']).sum(numeric_only=True).sort_values('Total')
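
Recent pandas versions raise an error when `mean()` is applied to groups that still contain text columns, which is why `numeric_only=True` is passed here. The sketch below, on hypothetical data, also shows `size()` as a way to count rows per group without adding a helper column.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Grass", "Fire", "Water"],
    "Defense": [49, 63, 43, 65],
})

# Mean of numeric columns per type, highest Defense first.
mean_by_type = (
    df.groupby("Type 1")
    .mean(numeric_only=True)
    .sort_values("Defense", ascending=False)
)

# Number of rows in each group.
counts = df.groupby("Type 1").size()

# Several aggregations of one column at once.
summary = df.groupby("Type 1")["Defense"].agg(["mean", "max", "count"])
```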


In [101]:
## reading headers
columns = df.columns
# print(columns)

## Read each column
name = df['Name'] #also we can have multiple columns df[['Name','Speed']]
# print(name)
# print(name[0:5])

## Reading each row
# print(df.iloc[1])

## reading specific location (R, C)
# print(df.iloc[2,2])

## iteration rows
# for index, row in df.iterrows():
#   print(index, row)

grass = df.loc[df['Type 1']=='Grass']
grass


## Describing data
describeData = df.describe()
describeData

## Sorting data by a column
sortedData = df.sort_values('Name',ascending=False) #ascending=False gives descending data
sortedData

## Creating a new column -> df['columnName'] = data
df['Total'] = df['HP'] + df['Attack'] + df['Defense'] + df['Sp. Atk'] + df['Sp. Def']+df['Speed']
df

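### Alternative: sum the stat columns by position (columns 4 through 9, HP to Speed in this file)
### Position-based slicing depends on column order, so check df.columns first
df['Total'] = df.iloc[:,4:10].sum(axis=1)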

## Dropping columns -> df.drop(columns= ColName)
# df = df.drop(columns='Total')
df

print(df['Total'].max())


## Filtering Data & Saving the Data
df.loc[(df['Type 1']=='Grass') & (df['Total']>=500)]
new_df = df.loc[(df['Type 1'] =='Grass') & (df['Type 2'] =='Poison') & (df['Total']>=500)]
new_df = new_df.reset_index(drop=True) # drop the existing index and build a new one
new_df.to_csv('Strong_Polemon.csv')

### Considering the data which contains the required string
df.loc[(df['Name'].str.contains('Mega'))]
df.loc[(df['Type 1'].str.contains('Grass'))]
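# na=False treats missing 'Type 2' values as "no match", so those rows are kept by the negation
df.loc[(df['Type 1'].str.contains('Grass')) & ~(df['Type 2'].str.contains('Poison', na=False))]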

### Using regex for more flexible filters
import re
df.loc[df['Name'].str.contains('pi[a-z]*',flags=re.I,regex=True)] ## rows whose Name contains "pi" (case-insensitive)
df.loc[(df['Type 1'].str.contains('Grass')) & (df['Type 2'].str.contains('Poison|Fighting', flags=re.I, regex=True, na=False))]

## Conditional Changes

df.loc[df['Type 1']=='Fairy', 'Type 1']='Fire'
df['Type 2'].unique()


## Groupby
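# numeric_only=True skips text columns such as Name, which cannot be averaged
df.groupby(['Type 1']).mean(numeric_only=True).sort_values('Defense',ascending=False)

df.groupby(['Name']).sum(numeric_only=True).sort_values('Attack',ascending=False)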


df['count'] = 1
df.groupby(['Name','Type 1','Type 2']).count()['count']

780


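| | # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fire | 1 | 1 | | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | |
| Grass | 4 | 4 | | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | | |
| Fire | 4 | 4 | | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | | |
| Water | 1 | 1 | | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | |
| Bug | 2 | 2 | | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Fairy | 1 | 1 | | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | |
| Flying | 2 | 2 | | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | | |
| Fire | 1 | 1 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | |
| Psychic | 2 | 2 | | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | | |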


In [30]:
df

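| | # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False | 318 |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False | 405 |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False | 525 |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False | 625 |
| 4 | 4 | Charmander | Fire | | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False | 309 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 719 | Diancie | Rock | Fairy | 50 | 100 | 150 | 100 | 150 | 50 | 6 | True | 600 |
| 796 | 719 | DiancieMega Diancie | Rock | Fairy | 50 | 160 | 110 | 160 | 110 | 110 | 6 | True | 700 |
| 797 | 720 | HoopaHoopa Confined | Psychic | Ghost | 80 | 110 | 60 | 150 | 130 | 70 | 6 | True | 600 |
| 798 | 720 | HoopaHoopa Unbound | Psychic | Dark | 80 | 160 | 60 | 170 | 130 | 80 | 6 | True | 680 |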
