# Reading from Different Files and Converting into DataFrames

## Reading from CSV
CSV (Comma-Separated Values) is a common format for storing tabular data. Pandas can read CSV files using pd.read_csv.

In [11]:
# Reading data from a CSV file
import pandas as pd

# 'office.csv' is an example; replace it with the path to your CSV file
df_csv = pd.read_csv('office.csv')
df_csv.head()


,name,marks,city
0,Gaurav,96,Gaya
1,Navin Sir,98,Bengaluru
2,Harsh Bhaiya,85,Jodhpur
3,Sushil,88,Bikaner
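
The call above relies on the defaults of pd.read_csv. As a minimal sketch of two commonly used options, the example below reads the same kind of data from an in-memory string instead of a file, so it runs without office.csv. The variable names csv_text and df_small are hypothetical and not part of the original notebook.

```python
import io

import pandas as pd

# Inline stand-in for a CSV file (assumption: same columns as the output above)
csv_text = """name,marks,city
Gaurav,96,Gaya
Navin Sir,98,Bengaluru
Harsh Bhaiya,85,Jodhpur
Sushil,88,Bikaner
"""

# usecols keeps only the listed columns; dtype fixes the column type up front
df_small = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["name", "marks"],
    dtype={"marks": "int64"},
)
print(df_small)
```

Selecting columns while reading, rather than dropping them afterwards, can reduce memory use on wide files.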


## Reading from Excel
Excel files are widely used for data storage and manipulation. Pandas can read data from .xls and .xlsx files.

In [12]:
# Reading data from an Excel file
# Make sure to install openpyxl if you are reading .xlsx files: pip install openpyxl
df_excel = pd.read_excel('sample_data.xlsx', sheet_name='Sheet1')
df_excel.head()


,Name,Age,Score
0,Gaurav,24,85
1,Navin Sir,27,90
2,Harsh Bhaiya,22,78
3,Sushil,32,88
4,Rocky,29,95
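
A workbook often holds more than one sheet. The sketch below wraps pd.read_excel in a small helper that loads every sheet at once. The helper name read_all_sheets is hypothetical, and reading .xlsx files this way assumes openpyxl is installed.

```python
import pandas as pd


def read_all_sheets(path):
    """Return a dict mapping each sheet name to its DataFrame."""
    # sheet_name=None asks pandas to load every sheet, not just the first one
    return pd.read_excel(path, sheet_name=None)
```

For example, read_all_sheets('sample_data.xlsx')['Sheet1'] would give the same DataFrame as the sheet_name='Sheet1' call above, assuming the workbook contains a sheet with that name.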


## Reading from JSON
JSON (JavaScript Object Notation) is a lightweight data format often used in web applications. Pandas can read JSON files and convert them into DataFrames.

In [13]:
# Reading data from a JSON file
# 'sample_data.json' is an example; replace it with the path to your JSON file
df_json = pd.read_json('sample_data.json')
df_json.head()


,Name,Age,Score
0,Aarav,24,85
1,Priya,27,90
2,Rohan,22,78
3,Sneha,32,88
4,Vikram,29,95
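
How pd.read_json builds the table depends on the layout of the JSON. As a minimal sketch, the example below assumes a "records" layout (a list of objects, one per row) and reads it from an in-memory string. The actual layout of sample_data.json is not shown in this notebook, so json_text is only an illustration.

```python
import io

import pandas as pd

# Assumed "records" layout: a JSON array with one object per row
json_text = """[
  {"Name": "Aarav", "Age": 24, "Score": 85},
  {"Name": "Priya", "Age": 27, "Score": 90},
  {"Name": "Rohan", "Age": 22, "Score": 78}
]"""

# Wrapping the text in StringIO lets read_json treat it like an open file
df_records = pd.read_json(io.StringIO(json_text), orient="records")
print(df_records)
```

Each key becomes a column and each object becomes a row.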


## Reading from a SQL Database (PostgreSQL)
Pandas can run a query against a SQL database and return the result as a DataFrame, given a database connection such as a SQLAlchemy engine built from a connection string.

In [14]:
# SQLAlchemy builds the engine; psycopg2 is the PostgreSQL driver it uses
# Install both if needed: pip install sqlalchemy psycopg2-binary
from sqlalchemy import create_engine

# Creating the connection string
# connection_string = f'postgresql://{username}:{password}@{host}:{port}/{db_name}'
connection_string = "postgresql://postgres:12345@localhost:5432/employeeDb"

engine = create_engine(connection_string)

# Reading data from the 'employee' table
df_sql = pd.read_sql('SELECT * FROM employee', engine)
df_sql.head()


,employee_id,name,email,department,company
0,1,Aarav Kumar,aarav.kumar@example.com,HR,Tech Innovations Pvt Ltd
1,2,Diya Sharma,diya.sharma@example.com,Marketing,Creative Minds Ltd
2,3,Rohan Gupta,rohan.gupta@example.com,Finance,Financial Solutions Inc
3,4,Isha Patel,isha.patel@example.com,IT,Tech Solutions Pvt Ltd
4,5,Aditya Singh,aditya.singh@example.com,Operations,Manufacturing Corp
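
The PostgreSQL example needs a running server. To try pd.read_sql without one, the sketch below swaps in an in-memory SQLite database from Python's standard library, which is not the setup used above. It also passes the filter value through params instead of pasting it into the SQL string. The table contents are a reduced, illustrative copy of the output above.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the PostgreSQL server (assumption)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, department) VALUES (?, ?)",
    [("Aarav Kumar", "HR"), ("Diya Sharma", "Marketing"), ("Rohan Gupta", "Finance")],
)
conn.commit()

# The ? placeholder is filled from params, so the value is never
# concatenated into the query text
df_hr = pd.read_sql(
    "SELECT * FROM employee WHERE department = ?", conn, params=("HR",)
)
conn.close()
print(df_hr)
```

The placeholder syntax depends on the driver: sqlite3 uses ?, while psycopg2 uses %s.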


## Reading from XML
To read data from an XML file into a Pandas DataFrame, use the pd.read_xml() function, which was introduced in Pandas 1.3.0. By default it parses with lxml, so make sure lxml is installed: pip install lxml. Alternatively, pass parser='etree' to use Python's built-in xml.etree.ElementTree module, which needs no extra installation.

In [16]:
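# Reading data from an XML file
# 'sample_data.xml' is an example; replace it with the path to your XML file
df_xml = pd.read_xml('sample_data.xml')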
df_xml.head()


,Name,Age,Score
0,Aarav,24,85
1,Priya,27,90
2,Rohan,22,78
3,Sneha,32,88
4,Vikram,29,95
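
pd.read_xml turns each element matched by its xpath argument into a row and that element's children into columns. As a minimal sketch, the example below parses an in-memory document with the built-in etree parser, so it does not require lxml. The structure of sample_data.xml is not shown in this notebook, so the row element name and the layout of xml_text are assumptions.

```python
import io

import pandas as pd

# Assumed layout: one <row> element per record, one child element per column
xml_text = """<data>
  <row><Name>Aarav</Name><Age>24</Age><Score>85</Score></row>
  <row><Name>Priya</Name><Age>27</Age><Score>90</Score></row>
  <row><Name>Rohan</Name><Age>22</Age><Score>78</Score></row>
</data>"""

# xpath selects the elements that become rows; parser="etree" avoids lxml
df_rows = pd.read_xml(io.StringIO(xml_text), xpath="./row", parser="etree")
print(df_rows)
```

The etree parser supports only simple path expressions. More complex XPath queries need the default lxml parser.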
