# Pandas Tutorial: Beginner to Advanced

*Definition*: Pandas is an open-source Python library used for data manipulation, analysis, and cleaning. It provides fast, flexible, and expressive data structures like ```DataFrame``` and ```Series``` to work with structured data efficiently.

### Key Features:
* Handles structured data (CSV, Excel, SQL, JSON, etc.).
* Powerful data operations (filtering, grouping, merging, reshaping).
* Built-in handling of missing data (NaN values).
* Supports time-series analysis and multi-indexing.
* Integrates with NumPy, Matplotlib, and Scikit-learn for data science tasks.
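
A few of the features above can be sketched in a handful of lines. This is a minimal illustration, not a complete tour: the `sales` table and its column names are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are invented for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, np.nan, 7, 3],
})

# Missing data: NaN values can be counted and filled.
missing_count = int(sales["units"].isna().sum())
filled = sales.fillna({"units": 0})

# Filtering: keep only the rows where units exceed 5.
large_orders = filled[filled["units"] > 5]

# Grouping: total units per region.
units_per_region = filled.groupby("region")["units"].sum()

print(missing_count)
print(large_orders)
print(units_per_region)
```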

### Usage:

```import pandas as pd```
### What kind of data does pandas handle?
Pandas can read structured and semi-structured data from a wide variety of sources and formats, including:
- CSV
- Excel
- SQL
- HTML
- XML
- JSON 
- Time series data
- Textual data

Pandas primarily uses two data structures:

* ```Series```: A one-dimensional array-like object (e.g., a single column of data).

* ```DataFrame```: A two-dimensional table with rows and columns (like a spreadsheet).
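
As a small sketch of the two structures, the example below builds one of each from plain Python lists. The names and values are illustrative only.

```python
import pandas as pd

# A Series: one-dimensional labelled values (like a single column).
ages = pd.Series([25, 32, 47], name="age")

# A DataFrame: a table whose columns share one row index.
people = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "age": [25, 32, 47],
})

# Selecting a single column of a DataFrame gives back a Series.
age_column = people["age"]

print(type(age_column).__name__)
print(people.shape)  # (rows, columns)
```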



### How to Read and Write Tabular Data?
* Pandas provides functions to read and write data in various formats.
* To read data, use the top-level reader functions: ``` pd.read_*("path/to/dataset/data.*") ```
* To write data, call the matching writer method on a DataFrame: ```df.to_*("path/to/dataset/data.*") ```
  
  -- Note: ```*``` stands for the file format. For example, if the data is ```.csv``` we use ``` pd.read_csv("path/to/dataset/data.csv") ```; for Excel data we use ``` pd.read_excel("path/to/dataset/data.xlsx") ```, and so on.
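
The read/write pairing can be tried without any existing dataset. The sketch below writes a small, made-up DataFrame to a temporary CSV file and reads it back; the file name and columns are assumptions for the example. Note that the writer is called on the DataFrame itself, while the reader is a function on `pd`.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data used only to demonstrate the round trip.
df = pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Lima"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "data.csv")

    # Writing: a DataFrame method.
    df.to_csv(csv_path, index=False)

    # Reading: a top-level pandas function.
    restored = pd.read_csv(csv_path)

print(restored)
```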

#### Read data example

In [None]:
import pandas as pd

# Read CSV file
df = pd.read_csv('new_data.csv')

# Read Excel file
df = pd.read_excel('new_data.xlsx')

# Read SQL table
from sqlalchemy import create_engine
engine = create_engine('sqlite:///database.db')
df = pd.read_sql('SELECT * FROM table_name', engine)

# Read JSON data
df = pd.read_json('data.json')

##### To read data from a SQL table, first create a connection to the database.

In [None]:
import pandas as pd
from sqlalchemy import create_engine

# Define database connection (Replace with your details)
db_user = "root"
db_password = ""
db_host = "localhost"  # Or your database server
db_name = "my_db"

In [7]:
# Create engine (the mysql+mysqlconnector dialect needs the mysql-connector-python package installed)
engine = create_engine(f"mysql+mysqlconnector://{db_user}:{db_password}@{db_host}/{db_name}")

In [8]:
# Read data from a table into Pandas DataFrame
table_name = "users"

In [9]:
df = pd.read_sql(f"SELECT * FROM {table_name}", engine)

In [None]:
df.head()
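
If no MySQL server is at hand, the same reading pattern can be tried against an in-memory SQLite database from the standard library. This is a sketch under that assumption: the `users` table and its rows are invented here, and a plain `sqlite3` connection stands in for the SQLAlchemy engine. It also passes the filter value through `params` rather than formatting it into the query string.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.commit()

# The "?" placeholder is filled in from params by the database driver.
users = pd.read_sql("SELECT * FROM users WHERE id >= ?", conn, params=(2,))
conn.close()

print(users)
```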

#### Write data example

In [None]:
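# Write to CSV (index=False leaves out the row index)
df.to_csv('output.csv', index=False)

# Write to Excel
df.to_excel('output.xlsx', index=False)

# Write to SQL, reusing the engine created above;
# if_exists='replace' drops and recreates the table if it already exists
df.to_sql('table_name', engine, if_exists='replace', index=False)

# Write to JSON
df.to_json('output.json')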
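
The effect of `if_exists` is easiest to see by counting rows. The sketch below again assumes an in-memory SQLite database and a made-up `people` table; `'append'` adds rows to the existing table, while `'replace'` drops it and writes it afresh.

```python
import sqlite3

import pandas as pd

# Hypothetical data and table name, used only for this demonstration.
df = pd.DataFrame({"id": [1, 2], "name": ["Ana", "Ben"]})
conn = sqlite3.connect(":memory:")

def count_rows(connection):
    """Return the number of rows currently stored in the people table."""
    result = pd.read_sql("SELECT COUNT(*) AS n FROM people", connection)
    return int(result["n"].iloc[0])

df.to_sql("people", conn, if_exists="replace", index=False)
df.to_sql("people", conn, if_exists="append", index=False)
rows_after_append = count_rows(conn)

df.to_sql("people", conn, if_exists="replace", index=False)
rows_after_replace = count_rows(conn)

conn.close()
print(rows_after_append, rows_after_replace)
```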

In [12]:
new_data = pd.read_excel('new_data.xlsx')

In [None]:
# Drop the phone column; drop() returns a new DataFrame and leaves new_data unchanged
df = new_data.drop(columns=['phone'])


In [15]:
df.to_csv('data.csv', index=False)
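
The drop-then-write step can be checked on a small in-memory table. The `contacts` data below is invented for the example; calling `to_csv` without a path returns the CSV text instead of writing a file, which makes the result easy to inspect.

```python
import pandas as pd

# Hypothetical contact list standing in for new_data.xlsx.
contacts = pd.DataFrame({
    "name": ["Ana", "Ben"],
    "phone": ["555-0100", "555-0101"],
    "city": ["Oslo", "Lima"],
})

# drop() returns a new DataFrame; the original keeps its phone column.
trimmed = contacts.drop(columns=["phone"])

# With no path argument, to_csv returns the CSV content as a string.
csv_text = trimmed.to_csv(index=False)

print(list(trimmed.columns))
print(csv_text)
```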