# 1. Introduction to Pandas

## Overview of Pandas
Pandas is an open-source data analysis and manipulation library for Python, built on top of NumPy. It provides flexible data structures and functions for working with structured data such as tables, time series, and matrices. It is widely used for data cleaning, data transformation, and exploratory data analysis.

## Key features of Pandas include:
- Easy handling of missing data.
- Flexible reshaping and pivoting of datasets.
- Fast data aggregation and group operations.
- Seamless integration with other libraries like NumPy, Matplotlib, and SciPy.

## Installation
You can install Pandas using pip, Python's package installer.
```bash
pip install pandas
```

Once installed, you can import the library as follows:

In [1]:
import pandas as pd

## Understanding Core Data Structures

### 1. Series
A Pandas **Series** is a one-dimensional labeled array capable of holding any data type (integers, strings, floats, Python objects, etc.). It is similar to a list or array in Python but comes with powerful functionality for data analysis.

#### Example:

In [2]:
data = [1, 2, 3, 4]
pd.Series(data)

0    1
1    2
2    3
3    4
dtype: int64

Each value in the Series has a corresponding index, which defaults to an integer index starting from 0.

### 2. DataFrame
A **DataFrame** is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It's the most commonly used structure in Pandas.

#### Example:

In [3]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

pd.DataFrame(data)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


DataFrames allow for easy manipulation of data in rows and columns, such as filtering, aggregation, and statistical analysis.

# 2. Data Structures

Pandas provides two key data structures: **Series** and **DataFrame**. These are the building blocks for data manipulation and analysis in Pandas.

## 1. Series: 1D Array-Like Structure
A **Series** is a one-dimensional array-like object that can hold any type of data (integers, floats, strings, Python objects, etc.). Each element in the Series has a label, called the **index**.

### Creating a Series
You can create a Series from a list, dictionary, or scalar value.

In [4]:
data = [10, 20, 30, 40]
series = pd.Series(data)
series

0    10
1    20
2    30
3    40
dtype: int64

You can also specify custom index labels:

In [5]:
series_with_index = pd.Series(data, index=['a', 'b', 'c', 'd'])
series_with_index

a    10
b    20
c    30
d    40
dtype: int64
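
The dictionary and scalar cases mentioned at the start of this section are not shown in the cells here, so the following is a minimal sketch of both. The names `prices` and `zeros` and their values are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# From a dictionary: the keys become the index labels.
prices = pd.Series({'apple': 1.5, 'banana': 0.25, 'cherry': 3.0})

# From a scalar: the value is repeated once for each label in `index`.
zeros = pd.Series(0, index=['a', 'b', 'c'])

print(prices)
print(zeros)
```

When building a Series from a scalar, the `index` argument determines the length of the result.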

### Accessing Elements by Index
Elements in a Series can be accessed using their index.

In [6]:
print(series[0])  # Outputs: 10
print(series_with_index['b'])  # Outputs: 20

10
20
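
When the index labels are themselves integers, a label and a position can refer to different elements. The sketch below uses a hypothetical Series `s` with a reordered integer index to show how `.loc[]` (label) and `.iloc[]` (position) make the intent explicit.

```python
import pandas as pd

# Index labels are integers, but deliberately not in positional order.
s = pd.Series([10, 20, 30], index=[2, 1, 0])

print(s.loc[0])   # element whose label is 0    -> 30
print(s.iloc[0])  # element at the first position -> 10
```

Using `.loc[]` and `.iloc[]` avoids relying on how plain square brackets interpret an integer key.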


You can also use slicing to access a range of elements:

In [7]:
series[1:3]

1    20
2    30
dtype: int64

## 2. DataFrame: 2D Table-Like Structure
A DataFrame is a two-dimensional, tabular data structure with labeled rows and columns, similar to a spreadsheet or SQL table. It can contain data of different types (numeric, string, boolean, etc.) in each column.

### Creating a DataFrame
You can create a DataFrame from various data sources like lists, dictionaries, CSV, or Excel files.

#### From a Dictionary:

In [8]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)
df


      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


#### From a List of Lists:

In [9]:
data = [
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
]

df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
df

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


#### From a CSV File:

In [10]:
df = pd.read_csv('data/users.csv')
df

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC
...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK


### Accessing Rows and Columns
You can access rows and columns in a DataFrame by using labels or positions.

#### Accessing Columns:

In [11]:
df['name']

0          Carl Todd
1      Imani Hartman
2        Lynn Carver
3      Sade Martinez
4       Uriel Romero
           ...      
95     Hermione Carr
96      Darrel Gates
97    Marshall Sykes
98      Hyatt Ashley
99     Palmer Bowers
Name: name, Length: 100, dtype: object

In [12]:
df[['name', 'address']]

Unnamed: 0,name,address
0,Carl Todd,"Ap #556-6076 Lorem, Av."
1,Imani Hartman,893-9946 Sociis Rd.
2,Lynn Carver,9010 Auctor. Road
3,Sade Martinez,560-9011 Magna. St.
4,Uriel Romero,596-1288 Amet Rd.
...,...,...
95,Hermione Carr,721-3192 Magna Rd.
96,Darrel Gates,"P.O. Box 976, 7887 Diam St."
97,Marshall Sykes,1495 Egestas. Rd.
98,Hyatt Ashley,"P.O. Box 490, 4963 Mauris Avenue"


#### Accessing Rows:
You can use `.loc[]` for label-based indexing or `.iloc[]` for position-based indexing.

In [13]:
df.iloc[1]

name                                                Imani Hartman
phone                                               070 6524 1233
email                                          tempor@hotmail.edu
address                                       893-9946 Sociis Rd.
postalZip                                                  575444
region                                                   Victoria
country                                                     India
list                                                           19
text            eget odio. Aliquam vulputate ullamcorper magna...
numberrange                                                     1
currency                                                   $40.74
alphanumeric                                          ROO25QJR7EJ
Name: 1, dtype: object

In [14]:
df_with_custom_index = df.set_index('name')
df_with_custom_index

name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC
...,...,...,...,...,...,...,...,...,...,...,...
Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX
Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH
Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM
Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK
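
Once a column has been promoted to the index, `.loc[]` can look rows up by those labels. The following is a small self-contained sketch; the `people` DataFrame and its values are made up for illustration and are not the `users.csv` data.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy'],
    'age': [25, 30, 35],
    'city': ['Dublin', 'Pune', 'Oslo'],
}).set_index('name')

row = people.loc['Bob']          # whole row, selected by index label
age = people.loc['Bob', 'age']   # single cell: row label, then column name
same_row = people.iloc[1]        # the same row, selected by position

print(row)
print(age)
```

`set_index()` returns a new DataFrame, which is why the notebook stores the result in a separate variable.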


### Indexing and Selecting Data
Pandas provides powerful indexing and selection capabilities:

#### Selecting a Single Element:

In [15]:
df.at[0, 'name']

'Carl Todd'

#### Selecting a Range of Rows and Columns:

In [16]:
df.loc[0:1, 'name':'phone']

            name          phone
0      Carl Todd      0800 1111
1  Imani Hartman  070 6524 1233


# 3. Data Manipulation

Pandas offers powerful tools for reading, writing, and manipulating structured data. This section will cover essential operations for data manipulation, including reading and writing data, inspecting data, selecting specific rows and columns, filtering, and handling missing values.

---

## 1. Reading and Writing Data

### Reading Data
Pandas can read data from various file formats, including CSV, Excel, JSON, and SQL databases.

#### Reading from a CSV file:

In [17]:
df = pd.read_csv('data/users.csv')
df.head()  # Displays the first few rows

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC
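
`read_csv()` infers column types, which can alter identifier-like values such as postal codes. As a hedged sketch, the example below reads a tiny in-memory CSV (the rows are invented for illustration) and uses the `dtype` parameter to keep one column as text.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO lets read_csv treat a string like a file.
csv_text = """name,postalZip,numberrange
Ann Example,01234,3
Bob Example,98765,7
"""

# Without dtype, '01234' would be parsed as the integer 1234.
df_small = pd.read_csv(io.StringIO(csv_text), dtype={'postalZip': str})

print(df_small)
print(df_small['postalZip'].tolist())  # ['01234', '98765']
```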


#### Reading from an Excel file:

In [18]:
df = pd.read_excel('data/users.xlsx')
df.head()

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC


#### Reading from a JSON file:

In [19]:
df = pd.read_json('data/users.json')
df.head()

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC


#### Reading from SQL:

In [20]:
# import sqlite3

# conn = sqlite3.connect('data/users.sql')
# df = pd.read_sql('SELECT * FROM table_name', conn)
# print(df.head())
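
The cell above is commented out because it depends on a database file and table that may not exist. A runnable sketch of the same idea, assuming an in-memory SQLite database and a hypothetical `users` table created just for this example, could look like this:

```python
import sqlite3
import pandas as pd

# Build a throwaway in-memory database with a hypothetical `users` table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT, country TEXT)')
conn.executemany(
    'INSERT INTO users VALUES (?, ?)',
    [('Ann Example', 'Ireland'), ('Bob Example', 'India')],
)
conn.commit()

# read_sql runs the query and returns the result set as a DataFrame.
users = pd.read_sql('SELECT * FROM users', conn)
conn.close()

print(users)
```

For databases other than SQLite, pandas generally expects an SQLAlchemy connection rather than a raw driver connection.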

### Writing Data
You can write a Pandas DataFrame to various formats like CSV, Excel, etc.

#### Writing to a CSV file:

In [21]:
df.to_csv('data/output.csv', index=False)  # Write DataFrame to a CSV file

#### Writing to an Excel file:

In [22]:
df.to_excel('data/output.xlsx', index=False)
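
To see what `index=False` changes, the sketch below writes a small, made-up DataFrame to an in-memory buffer both ways and reads it back. The buffer stands in for a file on disk.

```python
import io
import pandas as pd

original = pd.DataFrame({'name': ['Ann', 'Bob'], 'age': [25, 30]})

# Write without the index, then read the text back.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
print(list(restored.columns))   # ['name', 'age']

# Write with the index (the default); reading it back adds an unnamed column.
buffer_with_index = io.StringIO()
original.to_csv(buffer_with_index)
buffer_with_index.seek(0)
reread = pd.read_csv(buffer_with_index)
print(list(reread.columns))     # ['Unnamed: 0', 'name', 'age']
```

Passing `index=False` when writing (or `index_col=0` when reading) avoids the extra `Unnamed: 0` column.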

## 2. Viewing and Inspecting Data
### Viewing Data
Pandas provides easy ways to view and inspect your data.

#### Viewing the First/Last Rows:
- `head()`: Displays the first few rows.
- `tail()`: Displays the last few rows.

In [23]:
print(df.head())  # First 5 rows by default
print(df.tail(3))  # Last 3 rows


            name          phone                              email  \
0      Carl Todd      0800 1111  mauris.suspendisse@protonmail.net   
1  Imani Hartman  070 6524 1233                 tempor@hotmail.edu   
2    Lynn Carver  0980 367 7825        convallis.erat.eget@aol.org   
3  Sade Martinez     0845 46 43               dui.cum@hotmail.couk   
4   Uriel Romero   07381 542211             elit.nulla@outlook.com   

                   address postalZip        region    country  list  \
0  Ap #556-6076 Lorem, Av.     18838  Azad Kashmir    Ireland     1   
1      893-9946 Sociis Rd.    575444      Victoria      India    19   
2        9010 Auctor. Road     25252   East Region   Pakistan    17   
3      560-9011 Magna. St.    491187        Manisa   Pakistan     1   
4        596-1288 Amet Rd.      6700   Nova Scotia  Singapore    11   

                                                text  numberrange currency  \
0  pede ac urna. Ut tincidunt vehicula risus. Nul...            2    $3.61

### Inspecting Data
You can inspect the structure and summary of your data using the following methods:

- `info()`: Shows summary information about the DataFrame, such as the number of rows, column data types, and memory usage.
- `describe()`: Provides descriptive statistics for numeric columns.

In [24]:
df.info()  # Prints an overview of the DataFrame; it returns None, so no print() is needed
print(df.describe())  # Statistical summary for numeric columns


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   name          100 non-null    object
 1   phone         100 non-null    object
 2   email         100 non-null    object
 3   address       100 non-null    object
 4   postalZip     100 non-null    object
 5   region        100 non-null    object
 6   country       100 non-null    object
 7   list          100 non-null    int64 
 8   text          100 non-null    object
 9   numberrange   100 non-null    int64 
 10  currency      100 non-null    object
 11  alphanumeric  100 non-null    object
dtypes: int64(2), object(10)
memory usage: 9.5+ KB
             list  numberrange
count  100.000000   100.000000
mean    10.260000     4.850000
std      6.290316     2.731688
min      1.000000     0.000000
25%      5.000000     2.000000
50%     11.000000     5.000000
75%     15.000000     7.000000
max 
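
By default `describe()` summarizes only numeric columns, which is why the output above covers just `list` and `numberrange`. The sketch below, using a small invented `people` DataFrame, shows how `include='all'` extends the summary to text columns.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy'],
    'country': ['Ireland', 'India', 'Ireland'],
    'score': [2, 5, 8],
})

print(people.shape)                            # (rows, columns)
numeric_summary = people.describe()            # numeric columns only
full_summary = people.describe(include='all')  # adds unique/top/freq for text columns

print(numeric_summary)
print(full_summary)
```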

## 3. Data Selection
### Selecting Rows and Columns
You can select specific rows and columns in a DataFrame using labels or index positions.

#### Selecting Columns:

In [25]:
df['phone'] # Selecting a single column

0           0800 1111
1       070 6524 1233
2       0980 367 7825
3          0845 46 43
4        07381 542211
           ...       
95        0800 272602
96    (024) 0676 6116
97        0800 728475
98      0873 882 4844
99      076 9264 4467
Name: phone, Length: 100, dtype: object

In [26]:
df[['phone', 'name']] # Selecting multiple columns

Unnamed: 0,phone,name
0,0800 1111,Carl Todd
1,070 6524 1233,Imani Hartman
2,0980 367 7825,Lynn Carver
3,0845 46 43,Sade Martinez
4,07381 542211,Uriel Romero
...,...,...
95,0800 272602,Hermione Carr
96,(024) 0676 6116,Darrel Gates
97,0800 728475,Marshall Sykes
98,0873 882 4844,Hyatt Ashley


#### Selecting Rows:
You can select rows using label-based indexing (`loc[]`) or integer position-based indexing (`iloc[]`).

- `loc[]`: Selects data by row labels and column names.
- `iloc[]`: Selects data by index position.

In [27]:
df.loc[0] # Select a single row by label

name                                                    Carl Todd
phone                                                   0800 1111
email                           mauris.suspendisse@protonmail.net
address                                   Ap #556-6076 Lorem, Av.
postalZip                                                   18838
region                                               Azad Kashmir
country                                                   Ireland
list                                                            1
text            pede ac urna. Ut tincidunt vehicula risus. Nul...
numberrange                                                     2
currency                                                    $3.61
alphanumeric                                          NGM77DMT8HR
Name: 0, dtype: object

In [28]:
df.iloc[0] # Select a single row by index position

name                                                    Carl Todd
phone                                                   0800 1111
email                           mauris.suspendisse@protonmail.net
address                                   Ap #556-6076 Lorem, Av.
postalZip                                                   18838
region                                               Azad Kashmir
country                                                   Ireland
list                                                            1
text            pede ac urna. Ut tincidunt vehicula risus. Nul...
numberrange                                                     2
currency                                                    $3.61
alphanumeric                                          NGM77DMT8HR
Name: 0, dtype: object

### Slicing and Indexing
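You can slice rows and columns from a DataFrame using `loc[]` and `iloc[]`. Label slices with `loc[]` include the end label, while position slices with `iloc[]` exclude the end position.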

Example:

In [29]:
df.loc[0:2, 'name':'phone'] # Select specific rows and columns using loc

            name          phone
0      Carl Todd      0800 1111
1  Imani Hartman  070 6524 1233
2    Lynn Carver  0980 367 7825


In [30]:
df.iloc[0:2, [0, 2]] # Select rows by index and specific columns using iloc

            name                              email
0      Carl Todd  mauris.suspendisse@protonmail.net
1  Imani Hartman                 tempor@hotmail.edu
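
The two cells above return three rows and two rows respectively, even though both slices are written with small start and stop values. A minimal sketch with a made-up `contacts` DataFrame shows the difference in how the stop value is treated:

```python
import pandas as pd

contacts = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy', 'Di'],
    'phone': ['0800 1111', '070 6524', '0980 367', '0845 46'],
    'email': ['ann@example.com', 'bob@example.com',
              'cy@example.com', 'di@example.com'],
})

# loc slices by label and includes the stop label.
by_label = contacts.loc[0:2, 'name':'phone']   # rows 0, 1, 2; columns name..phone

# iloc slices by position and excludes the stop position.
by_position = contacts.iloc[0:2, 0:2]          # rows 0, 1; columns 0, 1

print(by_label.shape)     # (3, 2)
print(by_position.shape)  # (2, 2)
```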


## 4. Filtering Data
### Conditional Filtering
You can filter data by applying conditions to the DataFrame.

**Example:**

In [31]:
df[df['numberrange'] < 1] # Filter rows where a column's value is less than a threshold

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
34,Lawrence Callahan,0800 443215,diam.nunc@yahoo.couk,"310-365 Nec, Av.",6258 RJ,Limburg,United States,7,odio semper cursus. Integer mollis. Integer ti...,0,$50.25,JFH94FMU5JM
56,Lucian Poole,0381 848 2226,cubilia.curae@google.net,Ap #208-8133 Convallis St.,854031,Centre,Philippines,17,velit eu sem. Pellentesque ut ipsum ac mi elei...,0,$14.90,KUX22WPY5HF


### Boolean Indexing
You can apply conditions directly using Boolean indexing.

In [32]:
df[(df['numberrange'] < 20) & (df['country'] == 'Ireland')] # Using boolean conditions to filter rows

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
12,Donovan Tillman,0800 273521,egestas.ligula@aol.edu,9284 Ut St.,155171,Arunachal Pradesh,Ireland,15,amet orci. Ut sagittis lobortis mauris. Suspen...,1,$54.16,ORE63FCY7PW
76,Uriah Brock,07624 982244,mauris.eu@aol.couk,863-6171 Id Street,11-804,Umbria,Ireland,13,"mauris id sapien. Cras dolor dolor, tempus non...",7,$91.65,ULS13XYB5QL
83,James Simpson,(0117) 280 6451,metus.in@outlook.com,Ap #480-5104 Primis Ave,95616,Minas Gerais,Ireland,11,nibh vulputate mauris sagittis placerat. Cras ...,1,$35.21,MMU46BNE3CS
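
Conditions can also be combined with `|` (or), negated with `~`, or replaced by `isin()` when testing membership in a set of values. The following sketch uses an invented `people` DataFrame to illustrate these; it is not drawn from `users.csv`.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy', 'Di'],
    'country': ['Ireland', 'India', 'Ireland', 'Vietnam'],
    'score': [2, 0, 7, 5],
})

# Each comparison needs its own parentheses because & and | bind
# more tightly than comparison operators such as == and >=.
either = people[(people['score'] >= 5) | (people['country'] == 'India')]

# ~ inverts a boolean mask.
not_ireland = people[~(people['country'] == 'Ireland')]

# isin() keeps rows whose value appears in the given list.
in_set = people[people['country'].isin(['Ireland', 'Vietnam'])]

print(either['name'].tolist())       # ['Bob', 'Cy', 'Di']
print(not_ireland['name'].tolist())  # ['Bob', 'Di']
print(in_set['name'].tolist())       # ['Ann', 'Cy', 'Di']
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, so the operators `&`, `|`, and `~` are used instead.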


## 5. Adding/Removing Columns
### Adding New Columns
You can add new columns to a DataFrame using assignment.

#### Example:

In [33]:
df['name_and_phone'] = df['name'] + df['phone']
df

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR,Carl Todd0800 1111
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ,Imani Hartman070 6524 1233
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW,Lynn Carver0980 367 7825
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA,Sade Martinez0845 46 43
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC,Uriel Romero07381 542211
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX,Hermione Carr0800 272602
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH,Darrel Gates(024) 0676 6116
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM,Marshall Sykes0800 728475
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK,Hyatt Ashley0873 882 4844
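
In the output above the name and phone number run together because the two strings are concatenated directly. As a sketch on a small made-up DataFrame, a separator can be added in the same expression, and `assign()` offers a way to add a column without modifying the original.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Ann', 'Bob'],
    'phone': ['0800 1111', '070 6524'],
})

# Insert a separator so the two values stay readable.
people['name_and_phone'] = people['name'] + ' | ' + people['phone']

# assign() returns a new DataFrame and leaves `people` unchanged.
with_length = people.assign(name_length=people['name'].str.len())

print(people['name_and_phone'].tolist())
print(list(with_length.columns))
```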


### Removing Columns
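You can remove columns using the `drop()` method. It returns a new DataFrame and leaves `df` unchanged, so assign the result if you want to keep it.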

#### Example:

In [34]:
# Drop a single column
df.drop('name_and_phone', axis=1)

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC
...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK


In [35]:
# Drop multiple columns
df.drop(['name', 'phone'], axis=1)

Unnamed: 0,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR,Carl Todd0800 1111
1,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ,Imani Hartman070 6524 1233
2,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW,Lynn Carver0980 367 7825
3,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA,Sade Martinez0845 46 43
4,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC,Uriel Romero07381 542211
...,...,...,...,...,...,...,...,...,...,...,...
95,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX,Hermione Carr0800 272602
96,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH,Darrel Gates(024) 0676 6116
97,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM,Marshall Sykes0800 728475
98,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK,Hyatt Ashley0873 882 4844
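
Note that `name_and_phone` is still present in the output above, because the earlier `drop()` call did not modify `df`. The sketch below, on a made-up DataFrame, shows the `columns=` keyword (an alternative to `axis=1`) and how to keep the result by reassigning.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Ann', 'Bob'],
    'phone': ['0800 1111', '070 6524'],
    'email': ['ann@example.com', 'bob@example.com'],
})

# drop() returns a new DataFrame; `people` itself is untouched.
trimmed = people.drop(columns=['phone'])
print(list(trimmed.columns))  # ['name', 'email']
print(list(people.columns))   # ['name', 'phone', 'email']

# Reassign to make the removal stick.
people = people.drop(columns=['email'])
print(list(people.columns))   # ['name', 'phone']
```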


## 6. Handling Missing Data
Missing data is common in real-world datasets. Pandas provides several functions to detect and handle missing values.

### Detecting Missing Data
- `isnull()`: Detects missing values (NaN or None). `isna()` is an equivalent alias.
- `notnull()`: Detects non-missing values. `notna()` is an equivalent alias.

In [36]:
df.isnull()  # Returns True for NaN values

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,False,False,False,False,False,False,False,False,False,False,False,False,False
96,False,False,False,False,False,False,False,False,False,False,False,False,False
97,False,False,False,False,False,False,False,False,False,False,False,False,False
98,False,False,False,False,False,False,False,False,False,False,False,False,False
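
A full table of `True`/`False` values is hard to scan, and the `users.csv` data has no missing values to find. The sketch below builds a small DataFrame with deliberately missing entries (an assumption for illustration) and summarizes them per column and per row.

```python
import numpy as np
import pandas as pd

df_missing = pd.DataFrame({
    'name': ['Ann', None, 'Cy'],
    'score': [1.0, np.nan, 3.0],
})

# Summing booleans counts the True values, giving missing counts per column.
missing_per_column = df_missing.isnull().sum()

# any(axis=1) flags rows that contain at least one missing value.
has_missing = df_missing.isnull().any(axis=1)

print(missing_per_column)
print(has_missing.tolist())  # [False, True, False]
```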


### Dropping Missing Data
You can drop rows or columns that contain missing values using `dropna()`. Like `drop()`, it returns a new DataFrame rather than modifying `df`.

#### Example:

In [37]:
# Drop rows with any missing values
df.dropna()


Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR,Carl Todd0800 1111
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ,Imani Hartman070 6524 1233
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW,Lynn Carver0980 367 7825
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA,Sade Martinez0845 46 43
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC,Uriel Romero07381 542211
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX,Hermione Carr0800 272602
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH,Darrel Gates(024) 0676 6116
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM,Marshall Sykes0800 728475
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK,Hyatt Ashley0873 882 4844


In [38]:
# Drop columns with any missing values
df.dropna(axis=1)

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR,Carl Todd0800 1111
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ,Imani Hartman070 6524 1233
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW,Lynn Carver0980 367 7825
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA,Sade Martinez0845 46 43
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC,Uriel Romero07381 542211
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX,Hermione Carr0800 272602
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH,Darrel Gates(024) 0676 6116
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM,Marshall Sykes0800 728475
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK,Hyatt Ashley0873 882 4844
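
Because the dataset above has no missing values, both `dropna()` calls return it unchanged. The following sketch uses an invented DataFrame with gaps to show the effect, along with the `how` and `subset` parameters for finer control.

```python
import numpy as np
import pandas as pd

df_missing = pd.DataFrame({
    'name': ['Ann', 'Bob', None],
    'score': [1.0, np.nan, np.nan],
    'note': [np.nan, np.nan, np.nan],
})

# Default: drop every row with at least one missing value.
# Here every row has a missing 'note', so nothing is left.
any_missing = df_missing.dropna()

# Drop only the columns in which all values are missing.
without_empty_cols = df_missing.dropna(axis=1, how='all')

# Drop rows only when specific columns are missing.
need_score = df_missing.dropna(subset=['score'])

print(len(any_missing))                  # 0
print(list(without_empty_cols.columns))  # ['name', 'score']
print(need_score['name'].tolist())       # ['Ann']
```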


### Filling Missing Data
You can replace missing values with a specific value using `fillna()`.

#### Example:

In [39]:
# Fill missing values with a specific value
df.fillna(0)

Unnamed: 0,name,phone,email,address,postalZip,region,country,list,text,numberrange,currency,alphanumeric,name_and_phone
0,Carl Todd,0800 1111,mauris.suspendisse@protonmail.net,"Ap #556-6076 Lorem, Av.",18838,Azad Kashmir,Ireland,1,pede ac urna. Ut tincidunt vehicula risus. Nul...,2,$3.61,NGM77DMT8HR,Carl Todd0800 1111
1,Imani Hartman,070 6524 1233,tempor@hotmail.edu,893-9946 Sociis Rd.,575444,Victoria,India,19,eget odio. Aliquam vulputate ullamcorper magna...,1,$40.74,ROO25QJR7EJ,Imani Hartman070 6524 1233
2,Lynn Carver,0980 367 7825,convallis.erat.eget@aol.org,9010 Auctor. Road,25252,East Region,Pakistan,17,senectus et netus et malesuada fames ac turpis...,1,$84.87,OAM72YZF3QW,Lynn Carver0980 367 7825
3,Sade Martinez,0845 46 43,dui.cum@hotmail.couk,560-9011 Magna. St.,491187,Manisa,Pakistan,1,"dui augue eu tellus. Phasellus elit pede, male...",2,$36.73,HDQ00BJN2QA,Sade Martinez0845 46 43
4,Uriel Romero,07381 542211,elit.nulla@outlook.com,596-1288 Amet Rd.,6700,Nova Scotia,Singapore,11,vestibulum massa rutrum magna. Cras convallis ...,4,$45.69,VTX15TUS7TC,Uriel Romero07381 542211
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Hermione Carr,0800 272602,non.dapibus@protonmail.com,721-3192 Magna Rd.,26930,Queensland,Pakistan,19,"orci, adipiscing non, luctus sit amet, faucibu...",3,$6.14,UDS39INL1CX,Hermione Carr0800 272602
96,Darrel Gates,(024) 0676 6116,orci.phasellus@yahoo.ca,"P.O. Box 976, 7887 Diam St.",4144,West Papua,United Kingdom,1,amet ultricies sem magna nec quam. Curabitur v...,7,$79.54,PXO54TUV8NH,Darrel Gates(024) 0676 6116
97,Marshall Sykes,0800 728475,sapien.cras@protonmail.org,1495 Egestas. Rd.,18102,Utrecht,India,1,mi lacinia mattis. Integer eu lacus. Quisque i...,7,$25.30,IUC28LDY7MM,Marshall Sykes0800 728475
98,Hyatt Ashley,0873 882 4844,imperdiet@yahoo.org,"P.O. Box 490, 4963 Mauris Avenue",16882,Trøndelag,Vietnam,7,vestibulum. Mauris magna. Duis dignissim tempo...,2,$36.24,HWP81EEB2HK,Hyatt Ashley0873 882 4844


In [40]:
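# Fill missing values in 'phone' with the mean of 'numberrange'.
# To fill a numeric column with its own mean, use the same column on both sides,
# e.g. df['numberrange'].fillna(df['numberrange'].mean())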
df['phone'].fillna(df['numberrange'].mean())

0           0800 1111
1       070 6524 1233
2       0980 367 7825
3          0845 46 43
4        07381 542211
           ...       
95        0800 272602
96    (024) 0676 6116
97        0800 728475
98      0873 882 4844
99      076 9264 4467
Name: phone, Length: 100, dtype: object
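
The output above is unchanged because `phone` has no missing values to fill. As a hedged sketch on a made-up DataFrame with gaps, the example below fills a numeric column with its own mean and then fills several columns at once using a dictionary of per-column values.

```python
import numpy as np
import pandas as pd

df_missing = pd.DataFrame({
    'city': ['Dublin', None, 'Oslo'],
    'score': [2.0, np.nan, 4.0],
})

# mean() skips missing values, so the mean here is (2.0 + 4.0) / 2 = 3.0.
filled_score = df_missing['score'].fillna(df_missing['score'].mean())

# A dictionary maps each column to its own replacement value.
filled = df_missing.fillna({'city': 'unknown', 'score': 0})

print(filled_score.tolist())    # [2.0, 3.0, 4.0]
print(filled['city'].tolist())  # ['Dublin', 'unknown', 'Oslo']
```

As with `dropna()`, `fillna()` returns a new object, so the result needs to be assigned to take effect.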