<a href="https://colab.research.google.com/github/omyahro/Data_200_Royals/blob/main/ORoyals_Commit6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building Databases with SQL in Python
We will begin with a CSV file. Recall that this is a collection of data with each value separated by commas. The data can be many rows long, and each row is a **record**. Each column in the data represents a **field** in a database. The underlying structure for a database is already present in the CSV or XLSX file.

## Importing the Libraries
To create a database in Python, we will introduce a new library, `sqlite3`. It is part of the Python standard library, so no installation is needed; `import sqlite3` is enough. The `pip install sqlite-database` cell below installs a separate third-party package with a similar name, which is not required for the `sqlite3` code in this section. We will also use `csv`, another standard-library module, to handle the I/O for comma-separated value files.

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. https://www.sqlite.org/about.html

In [1]:
pip install sqlite-database

Collecting sqlite-database
  Downloading sqlite_database-0.7.2-py3-none-any.whl.metadata (4.2 kB)
Downloading sqlite_database-0.7.2-py3-none-any.whl (36 kB)
Installing collected packages: sqlite-database
Successfully installed sqlite-database-0.7.2


In [2]:
# Import the libraries
import sqlite3
import csv
import numpy as np
import pandas as pd
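
As a quick check that `sqlite3` is available without any installation step, the short sketch below prints the version of the SQLite C library bundled with your Python. It assumes only a standard Python 3 installation.

```python
import sqlite3

# sqlite3 is part of the standard library, so this import needs no pip install.
# sqlite_version is the version string of the underlying SQLite C library.
print("SQLite library version:", sqlite3.sqlite_version)

# sqlite_version_info holds the same version as a tuple of integers.
major_version = sqlite3.sqlite_version_info[0]
print("Major version:", major_version)
```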

## Establishing a Connection
The first step in database creation is to establish a connection to the database and then create a cursor, which lets you run SQL commands and step through their results. When connecting, you can enter the name of an existing database or designate a new filename, in which case SQLite creates the database file for you.

Syntax: `sqlite3.connect(database, timeout=5.0,
detect_types=0, isolation_level='DEFERRED',
check_same_thread=True, factory=sqlite3.Connection,
cached_statements=128, uri=False, *,
autocommit=sqlite3.LEGACY_TRANSACTION_CONTROL)`

We simplify the full syntax to just the name of the database. The additional parameters, and a detailed explanation of when to use them, are described here: https://docs.python.org/3/library/sqlite3.html#sqlite3-reference

In [3]:
# Create a connection
# Syntax: conn = sqlite3.connect('databaseName.sqlite')
conn = sqlite3.connect('energyindicators.sqlite')

# Create a cursor object to navigate
cur = conn.cursor()
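
The connect, cursor, and execute pattern can be tried out safely with a temporary database. The sketch below is illustrative only: it uses the special name `:memory:` so that nothing is written to disk, and the names `demo_conn` and `demo_cur` are placeholders chosen for this example rather than part of the notebook.

```python
import sqlite3

# ":memory:" creates a temporary database in RAM; passing a filename
# instead would create or open a database file on disk.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()

# Run a trivial query through the cursor to confirm the connection works.
demo_cur.execute("SELECT 1 + 1")
result = demo_cur.fetchone()[0]
print(result)

# Close the connection when finished to release the database.
demo_conn.close()
```

An in-memory database disappears when the connection closes, which makes it convenient for experiments that should not touch `energyindicators.sqlite`.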

# SQL Basics

## Creating Tables
The database is composed of tables that add structure and store the data. The next step is to create a table that holds the data we would like to use from the CSV file. The CSV may have a first row that names the fields, or it may consist of just the data without column headers. The field names are defined in a `CREATE TABLE` statement and should match the column names in the CSV.

When creating the table, each field should be given a data type (SQLite does not strictly require one, but declaring it documents the intended contents). The type names differ from Python's: for example, a Python string (`str`) is stored as `TEXT` in SQLite.

The following Python types can thus be sent to SQLite without any problem:
<table>
  <thead>
    <tr>
      <th>Python type</th>
      <th>SQLite type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>None</td>
      <td>NULL</td>
    </tr>
    <tr>
      <td>int</td>
      <td>INTEGER</td>
    </tr>
    <tr>
      <td>float</td>
      <td>REAL</td>
    </tr>
    <tr>
      <td>str</td>
      <td>TEXT</td>
    </tr>
    <tr>
      <td>bytes</td>
      <td>BLOB</td>
    </tr>
  </tbody>
</table>
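
To see the mapping in the table above in action, the following sketch inserts one value of each Python type and asks SQLite which storage class it used. It is a minimal illustration using an in-memory database; the table name `type_demo` and the sample values are made up for this example.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()

# A column declared without a type keeps each value's storage class as sent.
demo_cur.execute("CREATE TABLE type_demo (value)")

samples = [None, 42, 3.14, "hello", b"\x00\x01"]
demo_cur.executemany(
    "INSERT INTO type_demo (value) VALUES (?)",
    [(sample,) for sample in samples],
)

# typeof() is a built-in SQLite function that reports the storage class.
# rowid follows insertion order for this simple table.
demo_cur.execute("SELECT typeof(value) FROM type_demo ORDER BY rowid")
storage_classes = [row[0] for row in demo_cur.fetchall()]
print(storage_classes)  # ['null', 'integer', 'real', 'text', 'blob']

demo_conn.close()
```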

You can also store the field names and types in a Python dictionary and use it to build the `CREATE TABLE` statement for the table.

Example:<br>

       CREATE TABLE students(
       name TEXT,
       age INTEGER,
       grade INTEGER
       );
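
One possible way to drive the `students` example from a Python dictionary is sketched below. The dictionary `student_fields` and the variable names are assumptions made for this illustration; the approach simply assembles the SQL text from the dictionary's keys and values before executing it.

```python
import sqlite3

# Hypothetical field definitions: column name -> SQLite type.
student_fields = {"name": "TEXT", "age": "INTEGER", "grade": "INTEGER"}

# SQL placeholders (?) cannot stand in for table or column names, so the
# statement is assembled as a string. Only do this with names you control.
columns_sql = ", ".join(
    f"{name} {sql_type}" for name, sql_type in student_fields.items()
)
create_sql = f"CREATE TABLE students ({columns_sql})"
print(create_sql)

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(create_sql)

# PRAGMA table_info returns one row per column; index 1 is the column name.
column_names = [row[1] for row in demo_conn.execute("PRAGMA table_info(students)")]
print(column_names)

demo_conn.close()
```

Because the column names are pasted directly into the SQL text, this pattern is only appropriate when the dictionary comes from your own code, never from untrusted input.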

In [18]:
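# Drop the BlackAthletes table if it already exists so the next cell can recreate it.
# IF EXISTS avoids an error the first time the notebook is run.
cur.execute('DROP TABLE IF EXISTS BlackAthletes')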

<sqlite3.Cursor at 0x7fae7a8ec040>

In [19]:
# Create a new table named BlackAthletes
# Table fields will be id (assigned automatically), name, sport, country, and achievements
cur.execute('''
    CREATE TABLE BlackAthletes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        sport TEXT,
        country TEXT,
        achievements TEXT
    )
''')

conn.commit()


In [20]:
# Let's insert a record into the table
cur.execute('''
    INSERT INTO BlackAthletes (name, sport, country, achievements)
    VALUES ('Michael Jordan', 'Basketball', 'USA', '6× NBA champion, global icon, Nike Air Jordan legacy')
''')
conn.commit()
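
The same single-record insert can also be written with `?` placeholders, which keeps the values separate from the SQL text. The sketch below is a self-contained illustration against an in-memory copy of the table, not a replacement for the cell above; `demo_conn`, `demo_cur`, and the shortened `record` tuple are names and values chosen for this example.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute('''
    CREATE TABLE BlackAthletes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        sport TEXT,
        country TEXT,
        achievements TEXT
    )
''')

# Values are passed separately from the SQL text; sqlite3 handles the quoting.
record = ("Michael Jordan", "Basketball", "USA", "6× NBA champion")
demo_cur.execute(
    "INSERT INTO BlackAthletes (name, sport, country, achievements) VALUES (?, ?, ?, ?)",
    record,
)
demo_conn.commit()

# lastrowid holds the id that SQLite generated for the inserted row.
new_id = demo_cur.lastrowid
print(new_id)

demo_conn.close()
```

Placeholders matter most when a value contains quotes or comes from user input, since the text never has to be spliced into the SQL string by hand.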

In [21]:
# Let's insert more than one record into the table
# Start with a list of tuples, one tuple per record
athletes = [("Serena Williams", "Tennis", "USA", "23 Grand Slam titles, advocate for women in sport"),
    ("LeBron James", "Basketball", "USA", "4× NBA champion, philanthropist, I PROMISE School founder"),
    ("Simone Biles", "Gymnastics", "USA", "Most decorated gymnast in world history"),
    ("Usain Bolt", "Track & Field", "Jamaica", "8× Olympic gold medalist, world record holder"),
    ("Wilma Rudolph", "Track & Field", "USA", "First American woman to win 3 gold medals in one Olympics"),
    ("Muhammad Ali", "Boxing", "USA", "3× world heavyweight champion, civil rights activist"),
    ("Naomi Osaka", "Tennis", "Japan", "4× Grand Slam titles, mental health advocate"),
    ("Pelé", "Soccer", "Brazil", "3× World Cup winner, global icon of the sport"),
    ("Allyson Felix", "Track & Field", "USA", "Most decorated female U.S. Olympian in track"),
    ("Jackie Robinson", "Baseball", "USA", "Broke MLB color barrier in 1947")
]

# Now let's add the data as new records.
cur.executemany('''
    INSERT INTO BlackAthletes (name, sport, country, achievements)
    VALUES (?, ?, ?, ?)
''', athletes)

conn.commit()
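
After a bulk insert it is worth reading the rows back to confirm they arrived. The sketch below repeats the `executemany` pattern on an in-memory copy of the table with three of the records from the list above, then counts the rows and filters by sport. The names `demo_conn`, `demo_cur`, and `sample_athletes` are placeholders for this example.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute('''
    CREATE TABLE BlackAthletes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        sport TEXT,
        country TEXT,
        achievements TEXT
    )
''')

sample_athletes = [
    ("Serena Williams", "Tennis", "USA", "23 Grand Slam titles, advocate for women in sport"),
    ("Usain Bolt", "Track & Field", "Jamaica", "8× Olympic gold medalist, world record holder"),
    ("Naomi Osaka", "Tennis", "Japan", "4× Grand Slam titles, mental health advocate"),
]
demo_cur.executemany('''
    INSERT INTO BlackAthletes (name, sport, country, achievements)
    VALUES (?, ?, ?, ?)
''', sample_athletes)
demo_conn.commit()

# Count every row in the table.
demo_cur.execute("SELECT COUNT(*) FROM BlackAthletes")
row_count = demo_cur.fetchone()[0]
print(row_count)

# Filter with a placeholder and sort the matching names alphabetically.
demo_cur.execute(
    "SELECT name FROM BlackAthletes WHERE sport = ? ORDER BY name",
    ("Tennis",),
)
tennis_names = [row[0] for row in demo_cur.fetchall()]
print(tennis_names)

demo_conn.close()
```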