# Pandas Overview and Operations
This notebook will guide you through the fundamentals of the Pandas library and its data structures.
We will explore key concepts like:
1. Overview of Pandas and its data structures
2. Creating and working with Pandas Series
3. Creating and working with DataFrames
4. File operations with Pandas


In [None]:
# 1. Importing pandas library
import pandas as pd
import numpy as np

# 2. Creating a Pandas Series from different sources
data_list = [10, 20, 30, 40]
s1 = pd.Series(data_list)
print('Series from List:')
print(s1)

data_array = np.array([50, 60, 70, 80])
s2 = pd.Series(data_array)
print('\nSeries from NumPy Array:')
print(s2)

data_dict = {'a': 100, 'b': 200, 'c': 300}
s3 = pd.Series(data_dict)
print('\nSeries from Dictionary:')
print(s3)

# 3. Creating DataFrames
data = {
    'Name': ['Ali', 'Sara', 'Hina'],
    'Age': [20, 22, 21],
    'Marks': [85, 90, 80]
}
df = pd.DataFrame(data)
print('\nDataFrame:')
print(df)

# 4. File Operations
# Reading data from a CSV file
# Assuming 'data.csv' file exists
# df_csv = pd.read_csv('data.csv')
# print(df_csv.head())

# Writing DataFrame to a JSON file
df.to_json('output.json', orient='records', indent=4)
print('\nJSON file written to output.json')

### Explanation of Key Concepts
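The cell above covers the following:
1. **Creating Series**: From a list, a NumPy array, and a dictionary. A Series can also be built from a scalar value combined with an explicit index.
2. **Creating DataFrames**: From a dictionary of columns. Lists of rows and NumPy arrays are accepted as well.
3. **File Operations**: Writing a DataFrame to a JSON file. Reading from CSV is left commented out because it needs a local data.csv file. pandas provides matching readers and writers for CSV, Excel, and JSON.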
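
The scalar, list-of-rows, and NumPy array constructors named in the list are not exercised by the cell itself. A minimal sketch of them follows; the variable names and sample values here are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Series from a scalar: the value is repeated once per index label.
s_scalar = pd.Series(5, index=['x', 'y', 'z'])
print(s_scalar)

# DataFrame from a list of rows; column names are passed separately.
rows = [['Ali', 20], ['Sara', 22]]
df_from_list = pd.DataFrame(rows, columns=['Name', 'Age'])
print(df_from_list)

# DataFrame from a 2-D NumPy array; each inner array becomes one row.
arr = np.array([[1, 2], [3, 4]])
df_from_array = pd.DataFrame(arr, columns=['A', 'B'])
print(df_from_array)
```

A scalar needs an explicit index because pandas has no other way to know how many entries the Series should contain.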


In [4]:
# 5. Reading from a CSV file (from URL)
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'
df_url = pd.read_csv(url)
print('\nData read from remote CSV:')
print(df_url.head())

# 6. Writing to Excel file (commented out: requires an Excel engine such as openpyxl)
# df.to_excel('students_output.xlsx', index=False)
# print('\nExcel file written to students_output.xlsx')

# 7. Writing to a JSON file
df.to_json('students_output.json', orient='records', indent=4)
print('\nJSON file written to students_output.json')


Data read from remote CSV:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

JSON file written to students_output.json
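
The cell above writes JSON but never reads it back. The sketch below shows one way a write-then-read round trip might look for both JSON and CSV. To avoid touching the file system it keeps the text in memory with io.StringIO, which is an assumption made for this example; with real files you would pass a path instead.

```python
import io

import pandas as pd

students = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'Hina'],
    'Age': [20, 22, 21],
    'Marks': [85, 90, 80],
})

# With no path argument, to_json() and to_csv() return the text as a string.
json_text = students.to_json(orient='records')
csv_text = students.to_csv(index=False)

# Wrapping the text in StringIO lets the readers treat it like an open file.
from_json = pd.read_json(io.StringIO(json_text), orient='records')
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_json)
print(from_csv)
```

Passing index=False to to_csv() keeps the row index out of the file, so reading it back does not produce an extra unnamed column.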


### Bonus Section
Additional functionality worth knowing:
- File I/O: Reading from CSV (including directly from a URL), Excel, and JSON files.
- Writing data back to these formats.
- Operations on DataFrames such as filtering, slicing, and grouping.
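
Filtering, slicing, and grouping are named in the list but not demonstrated in the cells. A rough sketch of each is given here; the Section column and the fourth student are hypothetical additions made only so that grouping has something to aggregate.

```python
import pandas as pd

students = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'Hina', 'Omar'],  # 'Omar' is a made-up extra row
    'Section': ['A', 'B', 'A', 'B'],          # hypothetical grouping column
    'Marks': [85, 90, 80, 70],
})

# Filtering: keep only the rows where the boolean condition is True.
high_scorers = students[students['Marks'] >= 85]

# Slicing: iloc selects by integer position (end exclusive),
# loc selects by label (end inclusive).
first_two = students.iloc[0:2]
names_and_marks = students.loc[0:1, ['Name', 'Marks']]

# Grouping: split rows by Section, then average Marks within each group.
mean_by_section = students.groupby('Section')['Marks'].mean()

print(high_scorers)
print(first_two)
print(names_and_marks)
print(mean_by_section)
```

Note that iloc[0:2] and loc[0:1] return the same two rows here only because the default index labels happen to match the positions.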

This concludes the basics of the Pandas library for now. Further topics like data cleaning, merging, and visualization can be explored later.

In [5]:
# Importing pandas for data manipulation
import pandas as pd

# Sample Data for Pivot and Pivot Table examples
df = pd.DataFrame({
    'city': ['Lahore', 'Lahore', 'Karachi', 'Karachi'],
    'day': ['Mon', 'Tue', 'Mon', 'Tue'],
    'temperature': [30, 32, 29, 31]
})

# 1. Reshaping data using pivot() method
result = df.pivot(index='city', columns='day', values='temperature')
print('Pivot result:')
print(result)

# 2. Reshaping data using pivot_table() with aggregation
df2 = pd.DataFrame({
    'city': ['Lahore', 'Lahore', 'Karachi', 'Karachi', 'Lahore'],
    'day': ['Mon', 'Mon', 'Mon', 'Tue', 'Tue'],
    'temperature': [30, 32, 29, 31, 35]
})
pivot_result = df2.pivot_table(index='city', columns='day', values='temperature', aggfunc='mean')
print('\nPivot Table with mean temperature:')
print(pivot_result)

# 3. Reshaping data from wide to long format using melt()
df3 = pd.DataFrame({
    'city': ['Lahore', 'Karachi'],
    'Mon': [30, 29],
    'Tue': [32, 31]
})
melted_df = pd.melt(df3, id_vars='city', value_vars=['Mon', 'Tue'], var_name='day', value_name='temperature')
print('\nMelted DataFrame:')
print(melted_df)

# 4. Building a frequency table with the crosstab() function
df4 = pd.DataFrame({
    'gender': ['M', 'F', 'M', 'M', 'F'],
    'language': ['Python', 'Java', 'Python', 'C++', 'Java']
})
crosstab_result = pd.crosstab(df4['gender'], df4['language'])
print('\nCrosstab frequency table:')
print(crosstab_result)


Pivot result:
day      Mon  Tue
city             
Karachi   29   31
Lahore    30   32

Pivot Table with mean temperature:
day       Mon   Tue
city               
Karachi  29.0  31.0
Lahore   31.0  35.0

Melted DataFrame:
      city  day  temperature
0   Lahore  Mon           30
1  Karachi  Mon           29
2   Lahore  Tue           32
3  Karachi  Tue           31

Crosstab frequency table:
language  C++  Java  Python
gender                     
F           0     2       0
M           1     0       2
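
The second example uses pivot_table() rather than pivot() because df2 contains two readings for Lahore on Mon. The sketch below illustrates the difference under that same data: pivot() refuses duplicate index/column pairs, while pivot_table() combines them with an aggregation function. The choice of 'max' here is only for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({
    'city': ['Lahore', 'Lahore', 'Karachi', 'Karachi', 'Lahore'],
    'day': ['Mon', 'Mon', 'Mon', 'Tue', 'Tue'],
    'temperature': [30, 32, 29, 31, 35]
})

# pivot() cannot place two values in the same (city, day) cell,
# so it raises a ValueError on the duplicated Lahore/Mon pair.
try:
    df2.pivot(index='city', columns='day', values='temperature')
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print('pivot() failed:', exc)

# pivot_table() resolves the duplicates by aggregating them.
max_table = df2.pivot_table(index='city', columns='day',
                            values='temperature', aggfunc='max')
print(max_table)
```

If each index/column pair is known to be unique, pivot() is the stricter choice, since it surfaces unexpected duplicates instead of silently aggregating them.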
