# Final Project Part I (Due: Saturday, December 9, 2023, 11:59pm)

In this final project, you will work through the Database Design, ETL and Analytics phases using a multi-year Chicago crime data source. This final project is divided into two parts. Part I will focus on the database design and ETL. Part II will focus on the analytics.

In this Part I you will conduct the following tasks:

1. Reverse engineer an existing source RDBMS using metadata SQL queries to identify the table and attribute details necessary for creating tables and an entity-relationship diagram depicting the database logical structure. The source data is an SQLite database (you might consider producing an ERD for this SQLite database for your internal use, but it is **NOT REQUIRED to turn in**).
2. Implement a set of tables using DDL in your SSO dsa_student database schema on the postgres server that replicates the source database structure. **Be sure to critically examine the source database structure, columns, constraints, and relationships (Foreign Key References) for accuracy**.  Ensure you have required data types and use the same exact table names as specified for the destination pgsql database.
3. Create an Entity Relationship Diagram for the **destination PostgreSQL "database" tables** (not the SQLite db).
4. Establish connections to the source and destination databases.
5. Extract the source data from tables, Transform values as required and Load into the destination tables.
6. Validate the ETL process by confirming row counts in both source and destination database tables.


Specific resources and steps are listed below:

## Source SQLite Database

* Dataset URL: **/dsa/data/DSA-7030/cc23_7030.sqlite.db**
* Data Dictionary: [pdf](./ChicagoData-Description.pdf)
* [Chicago Crimes 2001-Present Dashboard](https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present-Dashboard/5cd6-ry5g)

This SQLite database consists of a set of normalized relations populated with publicly available Chicago crime data for the years 2001 to 2022.

## Database exploration

The cells below provide SQL DML statements for examining the underlying metadata in the SQLite database that describes the table, column, and relationship details. An initial connection and subsequent SQL statements are provided for acquiring the information necessary for reconstructing the table and relational structure in your postgres SSO database.

In [1]:
# Load extension and connect to database
%load_ext sql
%sql sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db

'Connected: @/dsa/data/DSA-7030/cc23_7030.sqlite.db'

## Explore the SQLite Tables List

This query simply lists the names of the database tables. Here's a quick reference discussing the sqlite_master table and its utility: [sqlite_master table meta data](https://bw.org/2019/02/25/finding-meta-data-in-sqlite/) - explore what metadata is provided.

In [2]:
%%sql
SELECT distinct m.type, m.tbl_name --m.sql
FROM sqlite_master AS m,
     pragma_table_info(m.name) AS t
WHERE m.type = 'table'
order by m.name, t.pk DESC

 * sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db
Done.


type,tbl_name
table,cc23_case_location
table,cc23_cases
table,cc23_fbi_nibrs_categories
table,cc23_fbi_nibrs_offense_categories
table,cc23_iucr_codes
table,cc23_iucr_codes_primary_descriptions
table,cc23_iucr_codes_secondary_descriptions
table,cc23_nibrs_crimes_against
table,cc23_nibrs_fbicode_offenses
table,cc23_nibrs_offenses_crimes_aginst
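
The commented-out `m.sql` column in the query above holds the `CREATE TABLE` text that SQLite keeps for each table, which can be a useful cross-check when rebuilding the structure. Below is a minimal Python sketch of reading it with the standard-library `sqlite3` module. It runs against a small in-memory stand-in database (the `demo_*` tables are hypothetical, not part of the course data); pointing it at the real file path is left as a comment.

```python
import sqlite3

def stored_ddl(conn):
    """Return {table_name: CREATE TABLE text} as recorded in sqlite_master."""
    rows = conn.execute(
        "SELECT tbl_name, sql FROM sqlite_master "
        "WHERE type = 'table' ORDER BY tbl_name"
    ).fetchall()
    return {name: sql for name, sql in rows}

# In-memory stand-in (hypothetical tables). For the real source, connect with
# sqlite3.connect("/dsa/data/DSA-7030/cc23_7030.sqlite.db") instead.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_codes (code varchar(10) PRIMARY KEY, idx char)")
demo.execute("CREATE TABLE demo_cases (case_number varchar(20) PRIMARY KEY)")

ddl_by_table = stored_ddl(demo)
for name, sql in ddl_by_table.items():
    print(name, "->", sql)
```

The stored text reflects how the source tables were declared, so any anomalies in the source constraints will appear there as well.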


## Explore Column Details

The query below provides the complete list of tables and their columns with important details.

* **tbl_name** = Name of the table
* **name** = column name
* **type** = declared data type
* **notnull** = indicates column declared as NOT NULL
* **pk** = indicates column is the primary key

In [3]:
%%sql
SELECT m.tbl_name, t.* --m.sql
 FROM pragma_table_info(m.tbl_name) t, sqlite_master m WHERE m.type='table';

 * sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db
Done.


tbl_name,cid,name,type,notnull,dflt_value,pk
cc23_iucr_codes,0,iucr_code,varchar(10),0,,1
cc23_iucr_codes,1,iucr_index_code,char,0,,0
cc23_iucr_codes_primary_descriptions,0,iucr_code,varchar(10),0,,1
cc23_iucr_codes_primary_descriptions,1,iucr_primary_desc,varchar(100),0,,0
cc23_iucr_codes_secondary_descriptions,0,iucr_code,varchar(10),0,,1
cc23_iucr_codes_secondary_descriptions,1,iucr_secondary_desc,varchar(100),0,,0
cc23_fbi_nibrs_categories,0,fbi_nibrs_category_name,varchar(50),0,,1
cc23_fbi_nibrs_offense_categories,0,nibrs_offense_code,varchar(10),1,,1
cc23_fbi_nibrs_offense_categories,1,fbi_nibrs_category_name,varchar(50),0,,0
cc23_nibrs_crimes_against,0,nibrs_crime_against,varchar(20),1,,1
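
The column metadata above maps fairly directly onto a `CREATE TABLE` draft. The sketch below is one possible way to automate that first pass, assuming only the `pragma_table_info` columns shown above; the helper name `draft_create_table`, the `SSO` schema placeholder, and the `demo_offense_map` table are hypothetical. The result is a starting point to review by hand (data types and constraints may still need adjusting for PostgreSQL), not a finished DDL.

```python
import sqlite3

def draft_create_table(conn, table, schema="SSO"):
    """Build a draft CREATE TABLE statement from pragma_table_info metadata."""
    # Columns returned: cid, name, type, notnull, dflt_value, pk
    info = conn.execute("SELECT * FROM pragma_table_info(?)", (table,)).fetchall()
    col_lines = []
    for _cid, name, col_type, notnull, _default, _pk in info:
        line = f"    {name} {col_type}"
        if notnull:
            line += " NOT NULL"
        col_lines.append(line)
    # pk > 0 is the column's position within the (possibly composite) primary key
    pk_rows = sorted((row for row in info if row[5] > 0), key=lambda row: row[5])
    if pk_rows:
        pk_cols = ", ".join(row[1] for row in pk_rows)
        col_lines.append(f"    PRIMARY KEY ({pk_cols})")
    return f"CREATE TABLE {schema}.{table} (\n" + ",\n".join(col_lines) + "\n);"

# Hypothetical stand-in table with a composite primary key
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE demo_offense_map ("
    " crime_against varchar(20), offense_code varchar(10) NOT NULL,"
    " PRIMARY KEY (crime_against, offense_code))"
)
ddl = draft_create_table(demo, "demo_offense_map")
print(ddl)
```

Foreign keys are deliberately left out of this draft, since they come from a separate metadata query and should be checked for accuracy before being added.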


## The query below provides the list of columns that are declared "unique" for referential integrity enforcement.

<u>Query Output Descriptions</u>
* **name** = the index name; the table name begins at "cc23_" (e.g., cc23_iucr_codes in sqlite_autoindex_cc23_iucr_codes_1).
* **unique** = indicates the column is declared "unique"
* **origin** = indicates the column is declared as primary key ("pk")
* **name_1** = column name

In [4]:
%%sql
select il.*,ii.* --,m.sql
    from sqlite_master m, 
    pragma_index_list( m.name ) as il,
    pragma_index_info(il.name) as ii

 * sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db
Done.


seq,name,unique,origin,partial,seqno,cid,name_1
0,sqlite_autoindex_cc23_iucr_codes_1,1,pk,0,0,0,iucr_code
0,sqlite_autoindex_cc23_iucr_codes_primary_descriptions_1,1,pk,0,0,0,iucr_code
0,sqlite_autoindex_cc23_iucr_codes_secondary_descriptions_1,1,pk,0,0,0,iucr_code
0,sqlite_autoindex_cc23_fbi_nibrs_categories_1,1,pk,0,0,0,fbi_nibrs_category_name
0,sqlite_autoindex_cc23_fbi_nibrs_offense_categories_1,1,pk,0,0,0,nibrs_offense_code
0,sqlite_autoindex_cc23_nibrs_crimes_against_1,1,pk,0,0,0,nibrs_crime_against
0,sqlite_autoindex_cc23_cases_1,1,pk,0,0,0,case_number
0,sqlite_autoindex_cc23_nibrs_fbicode_offenses_1,1,pk,0,0,0,nibrs_offense_code
0,sqlite_autoindex_cc23_nibrs_offenses_crimes_aginst_1,1,pk,0,0,0,nibrs_crime_against
0,sqlite_autoindex_cc23_nibrs_offenses_crimes_aginst_1,1,pk,0,1,1,nibrs_offense_code


## Explore Relationship Details (get foreign key references)

The query below extracts the details describing the foreign key references between tables.

* **from_table** = the name of the referencing table (the many side, which holds the foreign key)
* **from_column** = the name of the foreign key column in the referencing table
* **to_table** = the name of the referenced table (the one side)
* **to_column** = the name of the referenced column in the referenced table

These metadata can be translated to the necessary SQL statement to establish a relationship between tables:

```SQL
FOREIGN KEY (<from_column>) REFERENCES <to_table>(<to_column>)
```

In [5]:
%%sql
SELECT 
    m.name as from_table, f.'from' as from_column, f.'table' as to_table, f.'to' as to_column --, m.sql
FROM
    sqlite_master m
    JOIN pragma_foreign_key_list(m.name) f ON m.name != f."table"
WHERE m.type = 'table'
ORDER BY m.name
;

 * sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db
Done.


from_table,from_column,to_table,to_column
cc23_case_location,case_number,cc23_cases,case_number
cc23_cases,iucr_code,cc23_iucr_codes,iucr_code
cc23_fbi_nibrs_offense_categories,fbi_nibrs_category_name,cc23_fbi_nibrs_categories,fbi_nibrs_category_name
cc23_iucr_codes,iucr_code,cc23_cases,iucr_code
cc23_iucr_codes_primary_descriptions,iucr_code,cc23_iucr_codes,iucr_code
cc23_iucr_codes_secondary_descriptions,iucr_code,cc23_iucr_codes,iucr_code
cc23_nibrs_fbicode_offenses,nibrs_offense_code,cc23_cases,nibrs_fbi_offense_code
cc23_nibrs_fbicode_offenses,nibrs_offense_code,cc23_nibrs_offense_categories,nibrs_offense_code
cc23_nibrs_offenses_crimes_aginst,nibrs_offense_code,cc23_nibrs_fbicode_offenses,nibrs_offense_code
cc23_nibrs_offenses_crimes_aginst,nibrs_crime_against,cc23_nibrs_crimes_against,nibrs_crime_against
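
Since the instructions ask for a critical look at the foreign key references, one simple check is whether every `to_table` actually exists in the tables list. The sketch below shows that check with the standard-library `sqlite3` module on a hypothetical in-memory database (`demo_parent`, `demo_child`, and `demo_missing` are made-up names); SQLite accepts a foreign key that points at a table that was never created, which is why such anomalies can survive in a source database.

```python
import sqlite3

def foreign_keys(conn):
    """Return (from_table, from_column, to_table, to_column) for every foreign key."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    fks = []
    for table in tables:
        # Columns returned: id, seq, table, from, to, on_update, on_delete, match
        for row in conn.execute("SELECT * FROM pragma_foreign_key_list(?)", (table,)):
            fks.append((table, row[3], row[2], row[4]))
    return fks

def dangling_references(conn):
    """Foreign keys whose referenced table does not exist in the database."""
    existing = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    return [fk for fk in foreign_keys(conn) if fk[2] not in existing]

# Hypothetical stand-in: one valid reference and one pointing at a missing table
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_parent (code varchar(10) PRIMARY KEY)")
demo.execute(
    "CREATE TABLE demo_child ("
    " id integer PRIMARY KEY,"
    " code varchar(10) REFERENCES demo_parent(code),"
    " other varchar(10) REFERENCES demo_missing(code))"
)
all_fks = foreign_keys(demo)
bad_fks = dangling_references(demo)
print("dangling:", bad_fks)
```

A reference flagged this way still needs a human decision: the intended target may be a similarly named table, or the relationship may need to be dropped or reversed.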


## Using the metadata from above:

## Implement the required CREATE TABLE statements for establishing the Chicago Crime Database in your SSO dsa_student database.  

The SQL statement takes this form:

```SQL
CREATE TABLE SSO.cc23_tbl_name (
 column_name_1 data_type <unique, not null>,
 column_name_N data_type <unique, not null>,
 PRIMARY KEY (<column_name>),
 <FOREIGN KEY (from_column_name) REFERENCES SSO.cc23_to_table_name(to_column_name)>
 );
```

**The database tables, column names, and data types created in your SSO postgres server dsa_student database should be named exactly as they appear (with necessary modifications for any constraint or reference anomalies) in the ```cc23_7030.sqlite.db``` SQLite database.**

Use as many cells as desired.

# Connect to your SSO database using %sql magic or a SQLAlchemy connection and implement your database structure (create table...)

In [6]:
#implement tables in SSO database

import psycopg2
import getpass

database = "dsa_student"
user = input("Type username (pawprint) and hit enter: ")
password = getpass.getpass("Type password and hit enter: ")

connection = psycopg2.connect(database = database,
                              user     = user,
                              host     = 'pgsql.dsa.lan',
                              password = password)

Type username (pawprint) and hit enter: jsmm8
Type password and hit enter: ········


In [8]:
with connection, connection.cursor() as cursor:
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_nibrs_offenses_crimes_aginst CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_nibrs_fbicode_offenses CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_cases CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_case_location CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_iucr_codes_secondary_descriptions CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_iucr_codes_primary_descriptions CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_iucr_codes CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_fbi_nibrs_categories CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_nibrs_crimes_against CASCADE;")
    cursor.execute("DROP TABLE IF EXISTS jsmm8.cc23_fbi_nibrs_offense_categories CASCADE;")

In [9]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_iucr_codes (
            iucr_code varchar(10) PRIMARY KEY,
            iucr_index_code char
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_iucr_codes' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('iucr_code',)
('iucr_index_code',)


In [10]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_iucr_codes_primary_descriptions (
            iucr_code varchar(10) PRIMARY KEY,
            iucr_primary_desc varchar(100),
            FOREIGN KEY (iucr_code) REFERENCES jsmm8.cc23_iucr_codes(iucr_code)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_iucr_codes_primary_descriptions' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('iucr_code',)
('iucr_primary_desc',)


In [11]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_iucr_codes_secondary_descriptions (
            iucr_code varchar(10) PRIMARY KEY,
            iucr_secondary_desc varchar(100),
            FOREIGN KEY (iucr_code) REFERENCES jsmm8.cc23_iucr_codes(iucr_code)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_iucr_codes_secondary_descriptions' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('iucr_code',)
('iucr_secondary_desc',)


In [12]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_fbi_nibrs_categories (
            fbi_nibrs_category_name varchar(50) PRIMARY KEY
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_fbi_nibrs_categories' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('fbi_nibrs_category_name',)


In [13]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
       CREATE TABLE jsmm8.cc23_nibrs_crimes_against (
            nibrs_crime_against varchar(20) PRIMARY KEY
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_nibrs_crimes_against' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('nibrs_crime_against',)


In [14]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_fbi_nibrs_offense_categories (
            nibrs_offense_code varchar(10) PRIMARY KEY,
            fbi_nibrs_category_name varchar(50) REFERENCES jsmm8.cc23_fbi_nibrs_categories(fbi_nibrs_category_name)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_fbi_nibrs_offense_categories' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('nibrs_offense_code',)
('fbi_nibrs_category_name',)


In [15]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_nibrs_fbicode_offenses (
            nibrs_offense_code varchar(10) PRIMARY KEY, -- PRIMARY KEY already implies UNIQUE and NOT NULL
            nibrs_offense_name varchar(100) NOT NULL,
            FOREIGN KEY (nibrs_offense_code) REFERENCES jsmm8.cc23_fbi_nibrs_offense_categories(nibrs_offense_code)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_nibrs_fbicode_offenses' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('nibrs_offense_code',)
('nibrs_offense_name',)


In [16]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_nibrs_offenses_crimes_aginst (
            nibrs_crime_against varchar(20),
            nibrs_offense_code varchar(10),
            PRIMARY KEY (nibrs_crime_against, nibrs_offense_code),
            FOREIGN KEY (nibrs_crime_against) REFERENCES jsmm8.cc23_nibrs_crimes_against(nibrs_crime_against),
            FOREIGN KEY (nibrs_offense_code) REFERENCES jsmm8.cc23_nibrs_fbicode_offenses(nibrs_offense_code)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_nibrs_offenses_crimes_aginst' AND TABLE_SCHEMA = 'jsmm8';")
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row[0])

Columns in table:
nibrs_crime_against
nibrs_offense_code


In [17]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_case_location (
            case_number varchar(20) PRIMARY KEY,
            block varchar(100),
            location_description varchar(100),
            community_area integer,
            ward integer,
            district integer,
            beat integer,
            latitude real,
            longitude real
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_case_location' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('case_number',)
('block',)
('location_description',)
('community_area',)
('ward',)
('district',)
('beat',)
('latitude',)
('longitude',)


In [18]:
with connection, connection.cursor() as cursor:

    cursor.execute(
        '''
        CREATE TABLE jsmm8.cc23_cases (
            case_number varchar(20) PRIMARY KEY,
            incident_date timestamp,
            iucr_code varchar(10), 
            nibrs_fbi_offense_code varchar(10),
            arrest boolean,
            domestic boolean,
            updated_on timestamp,
            FOREIGN KEY (iucr_code) REFERENCES jsmm8.cc23_iucr_codes(iucr_code),
            FOREIGN KEY (nibrs_fbi_offense_code) REFERENCES jsmm8.cc23_nibrs_fbicode_offenses(nibrs_offense_code),
            FOREIGN KEY (case_number) REFERENCES jsmm8.cc23_case_location(case_number)
        );
        '''
    )

    cursor.execute("SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_NAME = 'cc23_cases' AND TABLE_SCHEMA = %s;", (user,))
    results = cursor.fetchall()

print("Columns in table:")
for row in results:
    print(row)

Columns in table:
('case_number',)
('incident_date',)
('iucr_code',)
('nibrs_fbi_offense_code',)
('arrest',)
('domestic',)
('updated_on',)
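
The cells above create each referenced table before any table that points to it, and the same order is a safe one for loading data later. As a rough sketch, that order can be derived rather than worked out by hand using `graphlib.TopologicalSorter` from the Python standard library (3.9+). The `depends_on` map below is an assumption transcribed from the `REFERENCES` clauses in the cells above, so it would need updating if the design changes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# table -> tables it references (assumed to mirror the CREATE TABLE cells above)
depends_on = {
    "cc23_iucr_codes": set(),
    "cc23_iucr_codes_primary_descriptions": {"cc23_iucr_codes"},
    "cc23_iucr_codes_secondary_descriptions": {"cc23_iucr_codes"},
    "cc23_fbi_nibrs_categories": set(),
    "cc23_nibrs_crimes_against": set(),
    "cc23_fbi_nibrs_offense_categories": {"cc23_fbi_nibrs_categories"},
    "cc23_nibrs_fbicode_offenses": {"cc23_fbi_nibrs_offense_categories"},
    "cc23_nibrs_offenses_crimes_aginst": {
        "cc23_nibrs_crimes_against",
        "cc23_nibrs_fbicode_offenses",
    },
    "cc23_case_location": set(),
    "cc23_cases": {
        "cc23_iucr_codes",
        "cc23_nibrs_fbicode_offenses",
        "cc23_case_location",
    },
}

# Referenced tables come out first, so this order works for CREATE and for loading
create_order = list(TopologicalSorter(depends_on).static_order())
# Dropping in the reverse order avoids relying on CASCADE
drop_order = list(reversed(create_order))

for position, table in enumerate(create_order, start=1):
    print(position, table)
```

`TopologicalSorter` raises `graphlib.CycleError` if two tables reference each other, which is itself a useful signal that a relationship in the source metadata needs a second look.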


## Construct and embed your Entity Relationship Diagram for your destination cc23_ PostgreSQL database

Upload your ERD image to the "final_project" folder and update the markdown below to display it here:

![ERD-HERE](ERD.png)


# Perform the ETL of the source data to your SSO dsa_student Chicago Crime Database

* Establish a connection to the SQLite source database using SQLAlchemy (best choice) - use an identifiable name.
* Perform ETL of the source data tables to the destination data tables incrementally (best choice) - use an identifiable name.
  * You may want to use pandas as the medium to ETL between the two databases -- **be patient!**
     * it can easily read "big" source SQL table data
     * hold data in a resizable data frame relative to computing resource constraints
     * make any necessary transformations to data values
     * write/load data to destination PostgreSQL tables
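
The incremental read, transform, and load pattern described above can be illustrated without pandas. The following is a minimal sketch using only the standard-library `sqlite3` module, where both the source and the destination are in-memory stand-ins (so the destination here is not PostgreSQL). The helper `copy_in_chunks` and the `demo_cases` table are hypothetical. With a psycopg2 destination the placeholder style would be `%s` rather than `?`.

```python
import sqlite3

def copy_in_chunks(src, dst, table, columns, transform=None, chunk_size=1000):
    """Copy rows from src to dst in fixed-size batches; return the rows copied."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    reader = src.execute(f"SELECT {col_list} FROM {table}")
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    copied = 0
    while True:
        batch = reader.fetchmany(chunk_size)  # only one chunk is held in memory
        if not batch:
            break
        if transform is not None:
            batch = [transform(row) for row in batch]
        with dst:  # commit each batch as its own transaction
            dst.executemany(insert_sql, batch)
        copied += len(batch)
    return copied

# Hypothetical stand-in databases with the same table on both sides
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE demo_cases (case_number TEXT PRIMARY KEY, arrest)")
src.executemany(
    "INSERT INTO demo_cases VALUES (?, ?)",
    [(f"C{i:04d}", i % 2) for i in range(2500)],
)
src.commit()

def flag_to_bool(row):
    """Transform step: turn the 0/1 arrest flag into a Python bool."""
    return (row[0], bool(row[1]))

rows_copied = copy_in_chunks(src, dst, "demo_cases", ["case_number", "arrest"], flag_to_bool)
dst_count = dst.execute("SELECT COUNT(*) FROM demo_cases").fetchone()[0]
print(rows_copied, dst_count)
```

Committing per batch keeps each transaction small; the trade-off is that a failure partway through leaves a partially loaded table that has to be cleared before rerunning.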
    

In [19]:
import getpass
import pandas as pd
from sqlalchemy import create_engine

password = getpass.getpass()
username = 'jsmm8'
host = 'pgsql.dsa.lan'
database = 'dsa_student'

conn_string = f"postgresql://{username}:{password}@{host}/{database}"
sqlite_db_url = "sqlite:////dsa/data/DSA-7030/cc23_7030.sqlite.db"
postgres_engine = create_engine(conn_string)
sqlite_engine = create_engine(sqlite_db_url)

········


In [20]:
query = "SELECT * FROM cc23_iucr_codes"
for iucr_codes in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    iucr_codes.to_sql('cc23_iucr_codes', postgres_engine, if_exists='append', index=False)
    iucr_codes = iucr_codes.head(5)
print(iucr_codes)

  iucr_code iucr_index_code
0      0110               I
1      0130               I
2      0141               N
3      0142               N
4      0261               I


In [21]:
query = "SELECT * FROM cc23_iucr_codes_primary_descriptions"
for primary_descriptions in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    primary_descriptions.to_sql('cc23_iucr_codes_primary_descriptions', postgres_engine, if_exists='append', index=False)
    primary_descriptions = primary_descriptions.head(5)
print(primary_descriptions)

  iucr_code    iucr_primary_desc
0      0110             HOMICIDE
1      0130             HOMICIDE
2      0141             HOMICIDE
3      0142             HOMICIDE
4      0261  CRIM SEXUAL ASSAULT


In [22]:
query = "SELECT * FROM cc23_iucr_codes_secondary_descriptions"
for secondary_descriptions in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    secondary_descriptions.to_sql('cc23_iucr_codes_secondary_descriptions', postgres_engine, if_exists='append', index=False)
    secondary_descriptions = secondary_descriptions.head(5)
print(secondary_descriptions)

  iucr_code       iucr_secondary_desc
0      0110       FIRST DEGREE MURDER
1      0130      SECOND DEGREE MURDER
2      0141  INVOLUNTARY MANSLAUGHTER
3      0142         RECKLESS HOMICIDE
4      0261       AGGRAVATED: HANDGUN


In [23]:
query = "SELECT * FROM cc23_fbi_nibrs_categories"
for fbi_nibrs_categories in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    fbi_nibrs_categories.to_sql('cc23_fbi_nibrs_categories', postgres_engine, if_exists='append', index=False)
    fbi_nibrs_categories = fbi_nibrs_categories.head(5)
print(fbi_nibrs_categories)

  fbi_nibrs_category_name
0        Assault Offenses
1  Larceny/Theft Offenses
2          Other Offenses
3          Animal Cruelty
4                   Arson


In [24]:
query = "SELECT * FROM cc23_fbi_nibrs_offense_categories"
for fbi_nibrs_offense_categories in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    fbi_nibrs_offense_categories.to_sql('cc23_fbi_nibrs_offense_categories', postgres_engine, if_exists='append', index=False)
    fbi_nibrs_offense_categories = fbi_nibrs_offense_categories.head(5)
print(fbi_nibrs_offense_categories)

  nibrs_offense_code fbi_nibrs_category_name
0                13A        Assault Offenses
1                04A        Assault Offenses
2                04B        Assault Offenses
3                23H  Larceny/Theft Offenses
4                90Z          Other Offenses


In [25]:
query = "SELECT * FROM cc23_nibrs_crimes_against"
for nibrs_crimes_against in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    nibrs_crimes_against.to_sql('cc23_nibrs_crimes_against', postgres_engine, if_exists='append', index=False)
    nibrs_crimes_against = nibrs_crimes_against.head(5)
print(nibrs_crimes_against)

  nibrs_crime_against
0         Not a Crime
1              Person
2            Property
3             Society


In [26]:
query = "SELECT * FROM cc23_nibrs_fbicode_offenses"
for nibrs_fbicode_offenses in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    nibrs_fbicode_offenses.to_sql('cc23_nibrs_fbicode_offenses', postgres_engine, if_exists='append', index=False)
    nibrs_fbicode_offenses = nibrs_fbicode_offenses.head(5)
print(nibrs_fbicode_offenses)

  nibrs_offense_code   nibrs_offense_name
0                13A   Aggravated Assault
1                04A  Aggravated Assault 
2                04B  Aggravated Battery 
3                23H    All Other Larceny
4                90Z   All Other Offenses


In [27]:
query = "SELECT * FROM cc23_nibrs_offenses_crimes_aginst"
for nibrs_offenses_crimes_aginst in pd.read_sql_query(query, sqlite_engine, chunksize=1000):
    nibrs_offenses_crimes_aginst.to_sql('cc23_nibrs_offenses_crimes_aginst', postgres_engine, if_exists='append', index=False)
    nibrs_offenses_crimes_aginst = nibrs_offenses_crimes_aginst.head(5)
print(nibrs_offenses_crimes_aginst)

  nibrs_crime_against nibrs_offense_code
0              Person                13A
1              Person                04A
2              Person                04B
3            Property                23H
4             Society                90Z


In [28]:
query = "SELECT * FROM cc23_case_location"
for case_location in pd.read_sql_query(query, sqlite_engine, chunksize=25000):
    case_location.to_sql('cc23_case_location', postgres_engine, if_exists='append', index=False)
    case_location = case_location.head(5)
print(case_location)

  case_number                                block location_description  \
0    JG401694  059XX S DR MARTIN LUTHER KING JR DR               STREET   
1    JG334529                 007XX S MICHIGAN AVE               STREET   
2    JG266039                 055XX N CAMPBELL AVE            APARTMENT   
3    JG332490                  001XX E SUPERIOR ST        HOTEL / MOTEL   
4    JG247154                  075XX N WESTERN AVE    CONVENIENCE STORE   

   community_area  ward  district  beat   latitude  longitude  
0              40  20.0         2   232  41.786671 -87.615783  
1              32   4.0         1   123  41.872841 -87.624194  
2               4  40.0        20  2011  41.982085 -87.691834  
3               8   2.0        18  1833  41.895751 -87.623496  
4               2  50.0        24  2411  42.018245 -87.690186  


In [29]:
query = "SELECT * FROM cc23_cases"
for cases in pd.read_sql_query(query, sqlite_engine, chunksize=25000):
    cases['arrest'] = cases['arrest'].astype(bool)
    cases['domestic'] = cases['domestic'].astype(bool)
    table_name = "cc23_cases"
    cases.to_sql(table_name, postgres_engine, if_exists="append", index=False)
    cases = cases.head(5)
print(cases)

  case_number           incident_date iucr_code nibrs_fbi_offense_code  \
0    JG401694  08/28/2023 07:03:00 PM      502R                     26   
1    JG334529  07/09/2023 04:55:00 AM      033A                     03   
2    JG266039  05/18/2023 09:35:00 PM      0486                    08B   
3    JG332490  06/04/2023 09:00:00 PM      0820                     06   
4    JG247154  05/03/2023 10:51:00 PM      0312                     03   

   arrest  domestic              updated_on  
0   False     False  09/05/2023 03:41:42 PM  
1   False     False  08/19/2023 03:40:26 PM  
2    True      True  08/19/2023 03:40:26 PM  
3   False     False  08/19/2023 03:40:26 PM  
4   False     False  08/19/2023 03:40:26 PM  
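
The `incident_date` and `updated_on` values printed above are text in `MM/DD/YYYY hh:mm:ss AM/PM` form. If an explicit conversion is preferred over relying on the database to cast the text when it is inserted, a sketch along these lines could serve as the transform step. The function name `parse_source_timestamp` is hypothetical, and the format string is an assumption based only on the sample rows shown.

```python
from datetime import datetime

# Assumed from the sample output above: month/day/year with a 12-hour clock
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_source_timestamp(text):
    """Parse 'MM/DD/YYYY hh:mm:ss AM/PM' text; empty or missing values become None."""
    if text is None or not str(text).strip():
        return None
    return datetime.strptime(str(text).strip(), SOURCE_FORMAT)

parsed = parse_source_timestamp("08/28/2023 07:03:00 PM")
print(parsed.isoformat())
```

On a pandas chunk, `pd.to_datetime(cases["incident_date"], format=SOURCE_FORMAT)` is the vectorized counterpart of the same idea.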


# Execute SQL DML commands (using %sql magic or SQLAlchemy) to confirm the table record counts for the destination database tables are consistent with the source database table record counts

In [30]:
# Confirm counts here
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes;")
    count_sqlite_cc23_iucr_codes = result.scalar()
print("SQLite:", count_sqlite_cc23_iucr_codes)

SQLite: 420


In [31]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes;")
    count_postgres_cc23_iucr_codes = result.scalar()
print("PostgreSQL:", count_postgres_cc23_iucr_codes)

PostgreSQL: 420


In [32]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes_primary_descriptions;")
    count_sqlite_cc23_iucr_codes_primary_descriptions = result.scalar()
print("SQLite:", count_sqlite_cc23_iucr_codes_primary_descriptions)

SQLite: 420


In [33]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes_primary_descriptions;")
    count_postgres_cc23_iucr_codes_primary_descriptions = result.scalar()
print("PostgreSQL:", count_postgres_cc23_iucr_codes_primary_descriptions)

PostgreSQL: 420


In [34]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes_secondary_descriptions;")
    count_sqlite_cc23_iucr_codes_secondary_descriptions = result.scalar()
print("SQLite:", count_sqlite_cc23_iucr_codes_secondary_descriptions)

SQLite: 420


In [35]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_iucr_codes_secondary_descriptions;")
    count_postgres_cc23_iucr_codes_secondary_descriptions = result.scalar()
print("PostgreSQL:", count_postgres_cc23_iucr_codes_secondary_descriptions)

PostgreSQL: 420


In [36]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_fbi_nibrs_categories;")
    count_sqlite_cc23_fbi_nibrs_categories = result.scalar()
print("SQLite:", count_sqlite_cc23_fbi_nibrs_categories)

SQLite: 36


In [37]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_fbi_nibrs_categories;")
    count_postgres_cc23_fbi_nibrs_categories = result.scalar()
print("PostgreSQL:", count_postgres_cc23_fbi_nibrs_categories)

PostgreSQL: 36


In [38]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_fbi_nibrs_offense_categories;")
    count_sqlite_cc23_fbi_nibrs_offense_categories = result.scalar()
print("SQLite:", count_sqlite_cc23_fbi_nibrs_offense_categories)

SQLite: 91


In [39]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_fbi_nibrs_offense_categories;")
    count_postgres_cc23_fbi_nibrs_offense_categories = result.scalar()
print("PostgreSQL:", count_postgres_cc23_fbi_nibrs_offense_categories)

PostgreSQL: 91


In [40]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_crimes_against;")
    count_sqlite_cc23_nibrs_crimes_against = result.scalar()
print("SQLite:", count_sqlite_cc23_nibrs_crimes_against)

SQLite: 4


In [41]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_crimes_against;")
    count_postgres_cc23_nibrs_crimes_against = result.scalar()
print("PostgreSQL:", count_postgres_cc23_nibrs_crimes_against)

PostgreSQL: 4


In [42]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_fbicode_offenses;")
    count_sqlite_cc23_nibrs_fbicode_offenses = result.scalar()
print("SQLite:", count_sqlite_cc23_nibrs_fbicode_offenses)

SQLite: 91


In [43]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_fbicode_offenses;")
    count_postgres_cc23_nibrs_fbicode_offenses = result.scalar()
print("PostgreSQL:", count_postgres_cc23_nibrs_fbicode_offenses)

PostgreSQL: 91


In [44]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_offenses_crimes_aginst;")
    count_sqlite_cc23_nibrs_offenses_crimes_aginst = result.scalar()
print("SQLite:", count_sqlite_cc23_nibrs_offenses_crimes_aginst)

SQLite: 90


In [45]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_nibrs_offenses_crimes_aginst;")
    count_postgres_cc23_nibrs_offenses_crimes_aginst = result.scalar()
print("PostgreSQL:", count_postgres_cc23_nibrs_offenses_crimes_aginst)

PostgreSQL: 90


In [46]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_case_location;")
    count_sqlite_cc23_case_location = result.scalar()
print("SQLite:", count_sqlite_cc23_case_location)

SQLite: 7932599


In [47]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_case_location;")
    count_postgres_cc23_case_location = result.scalar()
print("PostgreSQL:", count_postgres_cc23_case_location)

PostgreSQL: 7932599


In [48]:
with sqlite_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_cases;")
    count_sqlite_cc23_cases = result.scalar()
print("SQLite:", count_sqlite_cc23_cases)

SQLite: 7932599


In [49]:
with postgres_engine.connect() as connection:
    result = connection.execute("SELECT COUNT(*) FROM cc23_cases;")
    count_postgres_cc23_cases = result.scalar()
print("PostgreSQL:", count_postgres_cc23_cases)

PostgreSQL: 7932599
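
The pairwise checks above can also be expressed as a single loop over the table names. The sketch below shows the idea with two in-memory SQLite databases standing in for the source and destination (the `demo_*` tables and the helper names are hypothetical); any DB-API connection with an `execute` method that returns a cursor would fit the same shape.

```python
import sqlite3

def row_counts(conn, tables):
    """Return {table: COUNT(*)} for each table name."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}

def compare_counts(src, dst, tables):
    """Return {table: (source_count, destination_count, matches)}."""
    src_counts = row_counts(src, tables)
    dst_counts = row_counts(dst, tables)
    return {t: (src_counts[t], dst_counts[t], src_counts[t] == dst_counts[t]) for t in tables}

# Hypothetical stand-ins: demo_a is fully loaded, demo_b is missing a row
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE demo_a (id INTEGER PRIMARY KEY)")
    db.execute("CREATE TABLE demo_b (id INTEGER PRIMARY KEY)")
src.executemany("INSERT INTO demo_a VALUES (?)", [(i,) for i in range(5)])
dst.executemany("INSERT INTO demo_a VALUES (?)", [(i,) for i in range(5)])
src.executemany("INSERT INTO demo_b VALUES (?)", [(i,) for i in range(3)])
dst.executemany("INSERT INTO demo_b VALUES (?)", [(i,) for i in range(2)])

report = compare_counts(src, dst, ["demo_a", "demo_b"])
for table, (n_src, n_dst, ok) in report.items():
    print(f"{table}: source={n_src} destination={n_dst} {'OK' if ok else 'MISMATCH'}")
```

Matching counts confirm that no rows were dropped or duplicated, but not that values survived the transformation intact, so spot checks on a few rows remain worthwhile.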


## This is the end of Part I of the Final Project
### Part II will be deployed in Module 8.

# Save your notebook, then `File > Close and Halt`