## Creating a Pandas DataFrame

#### Creating an Empty DataFrame

In [4]:
import pandas as pd
import numpy as np

df = pd.DataFrame()

print(df)

Empty DataFrame
Columns: []
Index: []


#### Creating a DataFrame from a List

In [2]:
import pandas as pd

lst = ["Geeks", "For", "Geeks", "is", "portal", "for", "Geeks"]

df = pd.DataFrame(lst)
print(df)

        0
0   Geeks
1     For
2   Geeks
3      is
4  portal
5     for
6   Geeks


#### Creating a DataFrame from a NumPy Array

In [None]:
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df = pd.DataFrame(data, columns=["A", "B", "C"])
print(df)

   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9


#### Creating a DataFrame from a Dictionary of Lists

In [5]:
# Named 'data' rather than 'dict' to avoid shadowing the built-in dict type
data = {
    "name": ["aparna", "pankaj", "sudhir", "Geeku"],
    "degree": ["MBA", "BCA", "M.Tech", "MBA"],
    "score": [90, 40, 80, 98],
}

df = pd.DataFrame(data)

print(df)

     name  degree  score
0  aparna     MBA     90
1  pankaj     BCA     40
2  sudhir  M.Tech     80
3   Geeku     MBA     98
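
A list of dictionaries, where each dictionary describes one row, is also accepted by `pd.DataFrame`. The sketch below is illustrative only: the `records` and `df_records` names are assumptions, and the values reuse two rows from the example above. A key missing from a row is filled with NaN.

```python
import pandas as pd

# Each dictionary is one row; the keys become column names.
records = [
    {"name": "aparna", "degree": "MBA", "score": 90},
    {"name": "pankaj", "degree": "BCA"},  # no "score" key in this row
]

df_records = pd.DataFrame(records)
print(df_records)

# The missing score shows up as NaN rather than raising an error.
print(df_records["score"].isna())
```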


## Pandas DataFrame Index

### 1. Accessing and Modifying the Index

In [1]:
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}

df = pd.DataFrame(data)
print(df.index)  # Accessing the index

RangeIndex(start=0, stop=5, step=1)


### 2. Setting a Custom Index

In [2]:
# Set 'Name' column as the index
df_with_index = df.set_index("Name")
print(df_with_index)

         Age  Gender  Salary
Name                        
John      25    Male   50000
Alice     30  Female   55000
Bob       22    Male   40000
Eve       35  Female   70000
Charlie   28    Male   48000


### 3. Resetting the Index

   index     Name  Age  Gender  Salary
0      0     John   25    Male   50000
1      1    Alice   30  Female   55000
2      2      Bob   22    Male   40000
3      3      Eve   35  Female   70000
4      4  Charlie   28    Male   48000
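
The cell that produced the output above is not shown in this section. A minimal sketch, assuming it was a plain `reset_index()` call on the default-index frame, which moves the old index into a column named `index`. The `df_reset`, `df_named`, and `restored` names are illustrative.

```python
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}
df = pd.DataFrame(data)

# reset_index() returns a new frame; the old index becomes an "index" column.
df_reset = df.reset_index()
print(df_reset)

# On a frame indexed by Name, reset_index() turns Name back into a column.
df_named = df.set_index("Name")
restored = df_named.reset_index()
print(restored)
```

Passing `drop=True` discards the old index instead of keeping it as a column.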


### 4. Indexing with loc
* The loc[ ] indexer in pandas lets you access rows and columns of a DataFrame by their labels, which makes it easy to retrieve specific data points.

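In [None]:
# df has a default integer index, so look up "Alice" on the frame indexed by 'Name'
row = df_with_index.loc["Alice"]
print(row)

Age           30
Gender    Female
Salary     55000
Name: Alice, dtype: object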
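
Label lookups with `.loc` only succeed when the label is present in the index. A short sketch, assuming the same sample data with `Name` set as the index (the `people` name is illustrative), of selecting a single cell and several rows by label:

```python
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}
people = pd.DataFrame(data).set_index("Name")

# A row label plus a column label returns a single value.
alice_salary = people.loc["Alice", "Salary"]
print(alice_salary)

# Lists of labels return a DataFrame with those rows and columns, in that order.
picked = people.loc[["Eve", "John"], ["Age", "Gender"]]
print(picked)
```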

### Changing the index

In [14]:
# Set 'Age' as the new index
df_with_new_index = df.set_index("Age")
print(df_with_new_index)

        Name  Gender  Salary
Age                         
25      John    Male   50000
30     Alice  Female   55000
22       Bob    Male   40000
35       Eve  Female   70000
28   Charlie    Male   48000


In [15]:
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}

df = pd.DataFrame(data)
# Display the entire DataFrame
print(df)

      Name  Age  Gender  Salary
0     John   25    Male   50000
1    Alice   30  Female   55000
2      Bob   22    Male   40000
3      Eve   35  Female   70000
4  Charlie   28    Male   48000


### 1. Accessing Columns From DataFrame

In [23]:
# Access the 'Age' and 'Gender' columns
age_column = df["Age"]
gender_column = df["Gender"]

print(age_column)
print(gender_column)

0    25
1    30
2    22
3    35
4    28
Name: Age, dtype: int64
0      Male
1    Female
2      Male
3    Female
4      Male
Name: Gender, dtype: object


### 2. Accessing Rows by Index

In [None]:
# Access the row at integer position 2 (the third row)
third_row = df.iloc[2]
print(third_row)

Name        Bob
Age          22
Gender     Male
Salary    40000
Name: 2, dtype: object


### 3. Accessing Multiple Rows or Columns

In [26]:
# Access the first three rows (labels 0 to 2, inclusive) and the 'Name' and 'Age' columns
subset = df.loc[0:2, ["Name", "Age"]]
print(subset)

    Name  Age
0   John   25
1  Alice   30
2    Bob   22
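
Because `.loc` slices by label and includes both endpoints, `0:2` returns three rows. As a sketch of the difference, the position-based `.iloc` excludes the stop position, so the equivalent selection needs `0:3`. The `sample` frame below is a cut-down, illustrative copy of the data above.

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "Name": ["John", "Alice", "Bob", "Eve"],
        "Age": [25, 30, 22, 35],
    }
)

by_label = sample.loc[0:2, ["Name", "Age"]]  # labels 0, 1, 2 (stop included)
by_position = sample.iloc[0:3, 0:2]          # positions 0, 1, 2 (stop excluded)

# Both selections contain the same three rows and two columns.
print(by_label.equals(by_position))

# The same slice bounds give one row fewer with iloc.
print(len(sample.loc[0:2]), len(sample.iloc[0:2]))
```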


### 4. Accessing Rows Based on Conditions

In [27]:
# Access rows where 'Age' is greater than 25
filtered_data = df[df["Age"] > 25]
print(filtered_data)

      Name  Age  Gender  Salary
1    Alice   30  Female   55000
3      Eve   35  Female   70000
4  Charlie   28    Male   48000
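
Conditions can be combined with `&` (and), `|` (or), and `~` (not). Each comparison needs its own parentheses because these operators bind more tightly than `>` or `==`. A brief sketch on the same sample data, with illustrative variable names:

```python
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}
sample = pd.DataFrame(data)

# Rows where both conditions hold.
older_men = sample[(sample["Age"] > 25) & (sample["Gender"] == "Male")]
print(older_men)

# Rows where at least one condition holds.
low_or_high = sample[(sample["Salary"] < 45000) | (sample["Salary"] > 60000)]
print(low_or_high)

# Rows where the condition does not hold.
not_male = sample[~(sample["Gender"] == "Male")]
print(not_male)
```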


### 5. Accessing Specific Cells with at and iat
* To access a single cell, use the .at[ ] accessor for label-based indexing or the .iat[ ] accessor for integer position-based indexing. Both are optimized for fast access to single values.

In [28]:
# Access the 'Salary' of the row with label 2
salary_at_index_2 = df.at[2, "Salary"]
print(salary_at_index_2)

40000
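
For comparison, `.iat` takes integer positions for both the row and the column. A minimal sketch on the same data, with illustrative variable names; `Salary` is the fourth column, so its position is 3.

```python
import pandas as pd

data = {
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
}
sample = pd.DataFrame(data)

salary_by_label = sample.at[2, "Salary"]  # row label 2, column label "Salary"
salary_by_position = sample.iat[2, 3]     # row position 2, column position 3
print(salary_by_label, salary_by_position)

# Both accessors also support assigning to a single cell.
sample.at[2, "Salary"] = 42000
print(sample.iat[2, 3])
```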
