### Objectives

After completing this lab you will be able to:

- Create a database
- Create a table
- Insert data into the table
- Query data from the table
- Retrieve the result set into a pandas DataFrame
- Retrieve basic information about the DataFrame
- Filter, select, and index data
- Sort rows and add, rename, or drop columns
- Aggregate and group data
- Handle missing data
- Apply functions to columns
- Export the DataFrame
- Close the database connection

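### Task 1: Create a database

In [5]:
import sqlite3  # sqlite3 is part of the Python standard library

# Connect to the database file (it is created if it does not exist)
# and keep the returned connection object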
conn = sqlite3.connect('INSTRUCTOR.db')

A cursor lets you execute SQLite statements and fetch rows from the result sets of queries. Create a Cursor object by calling the cursor() method of the Connection object.
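As a minimal sketch of how a connection and its cursor work together, the example below uses an in-memory database (`:memory:`) so that it does not touch INSTRUCTOR.db. The table name `demo` and the variable names are hypothetical and exist only for illustration.

```python
import sqlite3

# ":memory:" creates a throwaway database that lives only in RAM
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()

# The cursor executes statements...
demo_cursor.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
demo_cursor.execute("INSERT INTO demo VALUES (1, 'example')")

# ...and exposes the result set of the most recent query
demo_cursor.execute("SELECT id, name FROM demo")
first_row = demo_cursor.fetchone()
print(first_row)

demo_conn.close()
```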

In [6]:
# cursor object
cursor_obj = conn.cursor()

### Task 2: Create a table in the database
In this step we will create the INSTRUCTOR table in the database:

In [7]:
# Drop the table if it already exists.
cursor_obj.execute("DROP TABLE IF EXISTS INSTRUCTOR")

<sqlite3.Cursor at 0x145d4b454c0>

In [8]:
# Creating table
table = """ create table IF NOT EXISTS INSTRUCTOR(ID INTEGER PRIMARY KEY NOT NULL,
FNAME VARCHAR(20), LNAME VARCHAR(20), CITY VARCHAR(20), CCODE CHAR(2));"""
 
cursor_obj.execute(table)
 
print("Table is Ready")

Table is Ready


### Task 3: Insert data into the table

In this step we will insert some rows of data into the table.

The INSTRUCTOR table we created in the previous step will contain 3 rows of data.

In [9]:
cursor_obj.execute('''insert into INSTRUCTOR values (1, 'Rav', 'Ahuja', 'TORONTO', 'CA')''')

<sqlite3.Cursor at 0x145d4b454c0>

In [10]:
cursor_obj.execute('''insert into INSTRUCTOR values (2, 'Raul', 'Chong', 'Markham', 'CA'), (3, 'Hima', 'Vasudevan', 'Chicago', 'US')''')

<sqlite3.Cursor at 0x145d4b454c0>
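The inserts above embed the values directly in the SQL text. A common alternative, sketched here against a separate in-memory database, is to pass the values as parameters with `?` placeholders and to call `commit()` so the rows are saved. The names `conn_demo`, `cur`, and `rows` are assumptions made for this example.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
cur = conn_demo.cursor()
cur.execute("""CREATE TABLE INSTRUCTOR(ID INTEGER PRIMARY KEY NOT NULL,
    FNAME VARCHAR(20), LNAME VARCHAR(20), CITY VARCHAR(20), CCODE CHAR(2))""")

rows = [
    (1, 'Rav', 'Ahuja', 'TORONTO', 'CA'),
    (2, 'Raul', 'Chong', 'Markham', 'CA'),
    (3, 'Hima', 'Vasudevan', 'Chicago', 'US'),
]

# Each "?" is bound to one value of the tuple; executemany runs the
# statement once per tuple in the list
cur.executemany("INSERT INTO INSTRUCTOR VALUES (?, ?, ?, ?, ?)", rows)
conn_demo.commit()  # make the inserted rows permanent

cur.execute("SELECT COUNT(*) FROM INSTRUCTOR")
row_count = cur.fetchone()[0]
print(row_count)

conn_demo.close()
```

Placeholders also keep quoting correct when a value itself contains a quote character.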

### Task 4: Query data in the table

In this step we will retrieve data we inserted into the INSTRUCTOR table.

In [12]:
statement = '''SELECT * FROM INSTRUCTOR'''
cursor_obj.execute(statement)

print("All the data")
output_all = cursor_obj.fetchall()
for row_all in output_all:
  print(row_all)


All the data
(1, 'Rav', 'Ahuja', 'TORONTO', 'CA')
(2, 'Raul', 'Chong', 'Markham', 'CA')
(3, 'Hima', 'Vasudevan', 'Chicago', 'US')


In [22]:
# Fetch only a few rows from the table
statement = '''SELECT * FROM INSTRUCTOR'''
cursor_obj.execute(statement)

print("First two rows")
# fetchmany(n) returns at most n rows from the result set
output_many = cursor_obj.fetchmany(2)
for row_many in output_many:
  print(row_many)

First two rows
(1, 'Rav', 'Ahuja', 'TORONTO', 'CA')
(2, 'Raul', 'Chong', 'Markham', 'CA')
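The fetch methods consume the result set progressively: each call continues where the previous one stopped. The sketch below shows this with a hypothetical five-row table `t` in an in-memory database.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
cur = conn_demo.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (4,), (5,)])

cur.execute("SELECT n FROM t ORDER BY n")
first = cur.fetchone()        # the first row only
next_two = cur.fetchmany(2)   # the two rows after it
rest = cur.fetchall()         # whatever has not been fetched yet

print(first, next_two, rest)
conn_demo.close()
```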


In [23]:
# Fetch only FNAME from the table
statement = '''SELECT FNAME FROM INSTRUCTOR'''
cursor_obj.execute(statement)

print("FNAME values")
# Each row is a one-element tuple because a single column was selected
output_column = cursor_obj.fetchall()
for fetch in output_column:
  print(fetch)

FNAME values
('Rav',)
('Raul',)
('Hima',)


### Write and execute an UPDATE statement that changes Rav's CITY to MOOSETOWN

In [24]:
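# Update Rav's CITY to MOOSETOWN
query_update = "update INSTRUCTOR set CITY='MOOSETOWN' where FNAME='Rav'"
cursor_obj.execute(query_update)
conn.commit()  # save the change to the database file

# An UPDATE returns no rows, so run a SELECT to see the result
cursor_obj.execute("SELECT * FROM INSTRUCTOR")
print("All the data")
output1 = cursor_obj.fetchmany(2)
for row in output1:
  print(row)

All the data
(1, 'Rav', 'Ahuja', 'MOOSETOWN', 'CA')
(2, 'Raul', 'Chong', 'Markham', 'CA')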
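An UPDATE statement produces no result set, so the way to confirm it worked is to check `cursor.rowcount` or to query the table again. The sketch below does both on a reduced, hypothetical copy of the table in an in-memory database, and passes the values as parameters.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
cur = conn_demo.cursor()
cur.execute("CREATE TABLE INSTRUCTOR(ID INTEGER PRIMARY KEY, FNAME TEXT, CITY TEXT)")
cur.executemany("INSERT INTO INSTRUCTOR VALUES (?, ?, ?)",
                [(1, 'Rav', 'TORONTO'), (2, 'Raul', 'Markham')])

cur.execute("UPDATE INSTRUCTOR SET CITY = ? WHERE FNAME = ?", ("MOOSETOWN", "Rav"))
updated = cur.rowcount  # number of rows changed by the UPDATE
conn_demo.commit()

# Query again to read the new value back
cur.execute("SELECT CITY FROM INSTRUCTOR WHERE FNAME = ?", ("Rav",))
new_city = cur.fetchone()[0]
print(updated, new_city)

conn_demo.close()
```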


### Task 5: Retrieve data into Pandas
In this step we will retrieve the contents of the INSTRUCTOR table into a pandas DataFrame.


In [25]:
import pandas as pd
# Retrieve the query results into a pandas DataFrame
df = pd.read_sql_query("select * from instructor;", conn)

# Display the DataFrame
df


| | ID | FNAME | LNAME | CITY | CCODE |
|---|---|---|---|---|---|
| 0 | 1 | Rav | Ahuja | MOOSETOWN | CA |
| 1 | 2 | Raul | Chong | Markham | CA |
| 2 | 3 | Hima | Vasudevan | Chicago | US |
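`pd.read_sql_query` also accepts a `params` argument, so the filtering can happen in SQL before the rows reach pandas. This is a sketch on a reduced, hypothetical copy of the table in an in-memory database; `ca_df` is an illustrative name.

```python
import sqlite3
import pandas as pd

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE INSTRUCTOR(ID INTEGER PRIMARY KEY, FNAME TEXT, CCODE TEXT)")
conn_demo.executemany("INSERT INTO INSTRUCTOR VALUES (?, ?, ?)",
                      [(1, 'Rav', 'CA'), (2, 'Raul', 'CA'), (3, 'Hima', 'US')])

# The "?" placeholder is filled from params by the sqlite3 driver
ca_df = pd.read_sql_query(
    "SELECT * FROM INSTRUCTOR WHERE CCODE = ? ORDER BY ID",
    conn_demo,
    params=("CA",),
)
print(ca_df)

conn_demo.close()
```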


In [26]:
# Print just the LNAME of the first row in the DataFrame
df.LNAME[0]

'Ahuja'

In [27]:
df.shape

(3, 5)

In [28]:
from IPython.display import display
display(df)

| | ID | FNAME | LNAME | CITY | CCODE |
|---|---|---|---|---|---|
| 0 | 1 | Rav | Ahuja | MOOSETOWN | CA |
| 1 | 2 | Raul | Chong | Markham | CA |
| 2 | 3 | Hima | Vasudevan | Chicago | US |


With a DataFrame like df in pandas, you can perform various typical operations for data exploration, analysis, and transformation. Here are some common ones to get started:

## 1. Basic Information and Summary
- df.head() – View the first few rows.
- df.info() – Get a summary of the DataFrame, including column names, types, and non-null counts.
- df.describe() – Get summary statistics for numeric columns.

In [32]:
#df.head() # View the first few rows.
#df.info() # Get a summary of the DataFrame, including column names, types, and non-null counts.
df.describe() # Get summary statistics for numeric columns.

| | ID |
|---|---|
| count | 3.0 |
| mean | 2.0 |
| std | 1.0 |
| min | 1.0 |
| 25% | 1.5 |
| 50% | 2.0 |
| 75% | 2.5 |
| max | 3.0 |


## 2. Indexing, Selecting, and Filtering Data



In [34]:
df[['FNAME', 'LNAME']] # Select specific columns.


| | FNAME | LNAME |
|---|---|---|
| 0 | Rav | Ahuja |
| 1 | Raul | Chong |
| 2 | Hima | Vasudevan |


In [35]:
df.iloc[0] # Access a row by its integer position.

ID               1
FNAME          Rav
LNAME        Ahuja
CITY     MOOSETOWN
CCODE           CA
Name: 0, dtype: object

In [36]:
df[df['CITY'] == 'Chicago'] # Filter rows based on column values.

| | ID | FNAME | LNAME | CITY | CCODE |
|---|---|---|---|---|---|
| 2 | 3 | Hima | Vasudevan | Chicago | US |
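Filters can be combined and paired with column selection. The sketch below rebuilds the same three rows as a standalone DataFrame named `df_demo` (a hypothetical name) so that it runs without the database.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'ID': [1, 2, 3],
    'FNAME': ['Rav', 'Raul', 'Hima'],
    'LNAME': ['Ahuja', 'Chong', 'Vasudevan'],
    'CITY': ['MOOSETOWN', 'Markham', 'Chicago'],
    'CCODE': ['CA', 'CA', 'US'],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
ca_outside_moosetown = df_demo[(df_demo['CCODE'] == 'CA') & (df_demo['CITY'] != 'MOOSETOWN')]

# .loc filters rows and selects columns in one step
ca_names = df_demo.loc[df_demo['CCODE'] == 'CA', ['FNAME', 'LNAME']]

# isin() matches any value from a list
in_cities = df_demo[df_demo['CITY'].isin(['Chicago', 'Markham'])]

print(ca_outside_moosetown)
print(ca_names)
print(in_cities)
```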


## 3. Sorting
- df.sort_values(by='LNAME') – Sort by a specific column.
- df.sort_values(by=['CITY', 'LNAME'], ascending=[True, False]) – Sort by multiple columns.

In [37]:
df.sort_values(by='LNAME') # Sort by a specific column.
df.sort_values(by=['CITY', 'LNAME'], ascending=[True, False]) # Sort by multiple columns.

| | ID | FNAME | LNAME | CITY | CCODE |
|---|---|---|---|---|---|
| 2 | 3 | Hima | Vasudevan | Chicago | US |
| 0 | 1 | Rav | Ahuja | MOOSETOWN | CA |
| 1 | 2 | Raul | Chong | Markham | CA |
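The output above places MOOSETOWN before Markham because string comparison is case-sensitive and uppercase letters sort ahead of lowercase ones. One way to sort without regard to case, sketched here on a standalone DataFrame with the hypothetical name `df_demo`, is the `key` argument of `sort_values` (available in pandas 1.1 and later).

```python
import pandas as pd

df_demo = pd.DataFrame({
    'FNAME': ['Rav', 'Raul', 'Hima'],
    'CITY': ['MOOSETOWN', 'Markham', 'Chicago'],
})

# Default ordering: uppercase "O" sorts before lowercase "a"
default_order = df_demo.sort_values(by='CITY')['CITY'].tolist()

# key receives the whole column and returns the values to compare instead
case_insensitive = df_demo.sort_values(
    by='CITY', key=lambda col: col.str.lower()
)['CITY'].tolist()

print(default_order)
print(case_insensitive)
```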


## 4. Adding, Renaming, or Dropping Columns

- df['FullName'] = df['FNAME'] + ' ' + df['LNAME'] – Create a new column by combining others.
- df.rename(columns={'FNAME': 'FirstName'}) – Rename a column.
- df.drop(columns=['CCODE']) – Drop a column.

In [38]:

df['FullName'] = df['FNAME'] + ' ' + df['LNAME'] # Create a new column by combining others.
df.rename(columns={'FNAME': 'FirstName'}) # Rename a column (returns a new DataFrame; df itself is unchanged).
#df.drop(columns=['CCODE']) # Drop a column (also returns a new DataFrame).

| | ID | FirstName | LNAME | CITY | CCODE | FullName |
|---|---|---|---|---|---|---|
| 0 | 1 | Rav | Ahuja | MOOSETOWN | CA | Rav Ahuja |
| 1 | 2 | Raul | Chong | Markham | CA | Raul Chong |
| 2 | 3 | Hima | Vasudevan | Chicago | US | Hima Vasudevan |
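Note that `rename()` and `drop()` return a new DataFrame and leave the original untouched, which is why FNAME is still present in df in the later cells. To keep the result, assign it to a name. The sketch below uses a one-row DataFrame with the hypothetical names `df_demo` and `df_renamed`.

```python
import pandas as pd

df_demo = pd.DataFrame({'FNAME': ['Rav'], 'LNAME': ['Ahuja'], 'CCODE': ['CA']})

# The result is discarded here, so df_demo keeps its original columns
df_demo.rename(columns={'FNAME': 'FirstName'})
columns_after_discard = list(df_demo.columns)

# Assigning the result keeps the change; the calls can be chained
df_renamed = (
    df_demo
    .rename(columns={'FNAME': 'FirstName'})
    .drop(columns=['CCODE'])
)

print(columns_after_discard)
print(list(df_renamed.columns))
```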


## 5. Aggregations and Grouping


In [39]:
df.groupby('CITY').size() # Count occurrences by group.
df.groupby('CCODE')['ID'].count() # Count IDs for each unique country code.
df.groupby('CCODE').agg({'ID': 'count', 'CITY': 'nunique'}) # Apply multiple aggregations.

| CCODE | ID | CITY |
|---|---|---|
| CA | 2 | 2 |
| US | 1 | 1 |
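In the result above the columns keep their source names (ID, CITY), which no longer describe what they contain. Named aggregation is one way to label the outputs. This is a sketch on a standalone DataFrame; `df_demo`, `instructors`, and `cities` are names chosen for the example.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'ID': [1, 2, 3],
    'CITY': ['MOOSETOWN', 'Markham', 'Chicago'],
    'CCODE': ['CA', 'CA', 'US'],
})

# new_column_name=(source_column, aggregation)
summary = df_demo.groupby('CCODE').agg(
    instructors=('ID', 'count'),
    cities=('CITY', 'nunique'),
)
print(summary)
```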


## 6. Handling Missing Data


In [40]:
df.fillna('Unknown') # Fill missing values with a default.
df.dropna() # Drop rows with missing values.

| | ID | FNAME | LNAME | CITY | CCODE | FullName |
|---|---|---|---|---|---|---|
| 0 | 1 | Rav | Ahuja | MOOSETOWN | CA | Rav Ahuja |
| 1 | 2 | Raul | Chong | Markham | CA | Raul Chong |
| 2 | 3 | Hima | Vasudevan | Chicago | US | Hima Vasudevan |
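The INSTRUCTOR data has no missing values, so the output above is identical to df. To see the methods take effect, the sketch below uses a standalone DataFrame (hypothetical name `df_demo`) in which one CITY is deliberately missing.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'FNAME': ['Rav', 'Raul', 'Hima'],
    'CITY': ['MOOSETOWN', None, 'Chicago'],  # None marks a missing value
})

# Count the missing values in each column
missing_per_column = df_demo.isna().sum()

# Fill only the CITY column with a default
filled = df_demo.fillna({'CITY': 'Unknown'})

# Drop rows where CITY is missing
dropped = df_demo.dropna(subset=['CITY'])

print(missing_per_column)
print(filled)
print(dropped)
```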


## 7. Applying Functions


In [41]:
df['CITY'] = df['CITY'].apply(lambda x: x.upper()) # Apply functions to columns.
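The lambda above calls `x.upper()` on every value, which raises an AttributeError if CITY contains a missing value. For string operations, the `.str` accessor is a common alternative that skips missing values. A sketch on a standalone DataFrame with the hypothetical name `df_demo`:

```python
import pandas as pd

df_demo = pd.DataFrame({'CITY': ['Markham', None, 'Chicago']})

# .str.upper() is applied to each string; the missing value stays missing
upper_city = df_demo['CITY'].str.upper()
print(upper_city)
```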

## 8. Merging and Joining with Other DataFrames

You can use pd.merge() to combine df with another DataFrame on a common column.
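As a minimal sketch, the example below joins instructor rows to a small lookup table of country names on the shared CCODE column. The `countries` DataFrame and its COUNTRY column are hypothetical and are not part of the lab's database.

```python
import pandas as pd

instructors = pd.DataFrame({
    'FNAME': ['Rav', 'Raul', 'Hima'],
    'CCODE': ['CA', 'CA', 'US'],
})

# Hypothetical lookup table keyed by country code
countries = pd.DataFrame({
    'CCODE': ['CA', 'US'],
    'COUNTRY': ['Canada', 'United States'],
})

# how='left' keeps every instructor, even one whose code has no match
merged = pd.merge(instructors, countries, on='CCODE', how='left')
print(merged)
```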

## 9. Exporting the DataFrame


In [42]:
df.to_csv('output.csv') # Export the DataFrame to a CSV file.
df.to_excel('output.xlsx') # Export to Excel (requires an Excel writer engine such as openpyxl).
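By default `to_csv()` also writes the row index as an unnamed first column, which shows up as "Unnamed: 0" when the file is read back. Passing `index=False` omits it. The sketch below writes to a temporary directory so that it leaves no files behind; `df_demo` and `round_trip` are illustrative names.

```python
import os
import tempfile

import pandas as pd

df_demo = pd.DataFrame({'ID': [1, 2], 'FNAME': ['Rav', 'Raul']})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'output.csv')

    # index=False leaves the 0, 1, 2... row labels out of the file
    df_demo.to_csv(csv_path, index=False)

    # Read the file back to check what was written
    round_trip = pd.read_csv(csv_path)

print(round_trip)
```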

In [43]:
# Close the connection
conn.close()
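To make sure the connection is closed even if an error occurs, one option is `contextlib.closing`. Using a sqlite3 connection directly in a `with` statement only commits or rolls back the transaction; it does not close the connection. A sketch with an in-memory database and a hypothetical table `t`:

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as demo_conn:
    # The inner "with" commits on success and rolls back on an exception
    with demo_conn:
        demo_conn.execute("CREATE TABLE t (n INTEGER)")
        demo_conn.execute("INSERT INTO t VALUES (1)")
    total = demo_conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# closing() has now called demo_conn.close(); further use raises an error
try:
    demo_conn.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(total, still_open)
```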