
# Building Databases with SQL in Python
We will begin with a CSV file. Recall that a CSV file is a collection of data in which the values are separated by commas. The data can be many rows long, and each row is a **record**. Each column represents a **field** in a database. In other words, the underlying structure of a database table is already present in a CSV or XLSX file.
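
As a minimal sketch of the record and field idea, the snippet below parses a small, made-up CSV string with the standard `csv` module. The column names and values are illustrative assumptions, not the course data. Each header name becomes a field and each parsed row becomes a record.

```python
import csv
import io

# Hypothetical CSV text used only for illustration; a real file would be
# opened with open('file.csv', newline='') instead of io.StringIO.
csv_text = (
    "name,sport,country\n"
    "Serena Williams,Tennis,USA\n"
    "Usain Bolt,Track & Field,Jamaica\n"
)

reader = csv.DictReader(io.StringIO(csv_text))
fields = reader.fieldnames   # column headers, which play the role of fields
records = list(reader)       # one dict per row, which play the role of records

print(fields)
print(records[0]["name"])
```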

## Importing the Libraries
To create a database in Python, we will introduce a new library, `sqlite3`. It is part of Python's standard library, so there is nothing to install with `pip`; we only need to import it. We will also use `csv`, another standard-library module, to handle the I/O for comma-separated value files.

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. https://www.sqlite.org/about.html

In [4]:
# No installation step is needed: sqlite3 and csv ship with Python



In [5]:
# Import the libraries
import sqlite3
import csv
import numpy as np
import pandas as pd

## Establishing a Connection
The first step in creating a database is to establish a connection to it and then create a cursor, which lets you run commands and step through the results. When connecting, you can give the name of an existing database file or a new filename; if the file does not exist, SQLite creates it.

Syntax: `sqlite3.connect(database, timeout=5.0,
detect_types=0, isolation_level='DEFERRED',
check_same_thread=True, factory=sqlite3.Connection,
cached_statements=128, uri=False, *,
autocommit=sqlite3.LEGACY_TRANSACTION_CONTROL)`

We simplify the full syntax by passing only the name of the database. The additional parameters, with a detailed explanation of when to use them, are documented here: https://docs.python.org/3/library/sqlite3.html#sqlite3-reference
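
To try the connection call without leaving a file behind, a sketch like the following can help. It assumes the special name `':memory:'`, which asks SQLite for a temporary in-RAM database, and passes an arbitrary `timeout` value purely to show one optional parameter. The names `demo_conn` and `demo_cur` are placeholders chosen so they do not clash with the notebook's own `conn` and `cur`.

```python
import sqlite3

# ':memory:' creates a temporary database held in RAM, so nothing is
# written to disk. A filename would create that file if it is missing.
demo_conn = sqlite3.connect(':memory:', timeout=10.0)
demo_cur = demo_conn.cursor()

# Run a trivial query to confirm that the connection works.
demo_cur.execute('SELECT sqlite_version()')
version = demo_cur.fetchone()[0]
print(version)

demo_conn.close()
```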

In [6]:
# Create a connection
# Syntax: conn = sqlite3.connect('databaseName.sqlite')
conn = sqlite3.connect('energyindicators.sqlite')

# Create a cursor object to navigate
cur = conn.cursor()

# SQL Basics

## Creating Tables
The database is composed of tables that add structure and store the data. The next step is to create a table that holds the data we would like to use from the CSV file. The data in the CSV may have a first row that designates the names of the fields, or it may consist of just the data without column headers. The field names are defined with a `CREATE TABLE` statement so that the table's fields match the column names.
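
When the CSV does have a header row, that row should be read separately so it is not stored as data. The sketch below shows one possible way to do this, using a made-up CSV string, a hypothetical `students` table, and an in-memory database so that nothing touches the notebook's own connection.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; the first row holds the column headers.
csv_text = "name,age,grade\nAda,15,10\nGrace,16,11\n"

demo_conn = sqlite3.connect(':memory:')
demo_cur = demo_conn.cursor()
demo_cur.execute('CREATE TABLE students (name TEXT, age INTEGER, grade INTEGER)')

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)                   # consume the header row
rows = [tuple(row) for row in reader]   # remaining rows are the records

demo_cur.executemany(
    'INSERT INTO students (name, age, grade) VALUES (?, ?, ?)', rows
)
demo_conn.commit()

demo_cur.execute('SELECT name, age, grade FROM students ORDER BY rowid')
loaded = demo_cur.fetchall()
print(header)
print(loaded)

demo_conn.close()
```

The `csv` module returns every value as a string; the `INTEGER` columns convert numeric-looking text such as `'15'` to integers when the rows are stored.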

When creating the table, each field's data type must be designated. The type names differ between the two languages: a Python string (`str`), for example, is stored as `TEXT` in SQLite.

The following Python types can thus be sent to SQLite without any problem:
<table>
  <thead>
    <tr>
      <th>Python type</th>
      <th>SQLite type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>None</td>
      <td>NULL</td>
    </tr>
    <tr>
      <td>int</td>
      <td>INTEGER</td>
    </tr>
    <tr>
      <td>float</td>
      <td>REAL</td>
    </tr>
    <tr>
      <td>str</td>
      <td>TEXT</td>
    </tr>
    <tr>
      <td>bytes</td>
      <td>BLOB</td>
    </tr>
  </tbody>
</table>
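
One way to check this mapping yourself is to pass one value of each Python type to SQLite's built-in `typeof()` function. This is a small sketch against a throwaway in-memory database; the sample values are arbitrary.

```python
import sqlite3

demo_conn = sqlite3.connect(':memory:')
demo_cur = demo_conn.cursor()

# One value for each Python type in the table: None, int, float, str, bytes.
values = (None, 42, 3.14, 'hello', b'\x00\x01')

# typeof() reports the storage class SQLite uses for each bound value.
demo_cur.execute(
    'SELECT typeof(?), typeof(?), typeof(?), typeof(?), typeof(?)', values
)
storage_classes = demo_cur.fetchone()
print(storage_classes)

demo_conn.close()
```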

You can also store the field names and their types in a Python dictionary and build the `CREATE TABLE` statement from it.

Example:<br>

       CREATE TABLE students(
       name TEXT,
       age INTEGER,
       grade INTEGER
       );
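
One way to read the dictionary idea is to keep field names and types in a `dict` and assemble the statement text from it. The sketch below does this for the `students` example, again on a throwaway in-memory database. The helper names (`fields`, `create_sql`, `demo_conn`) are illustrative only.

```python
import sqlite3

# Field names mapped to SQLite types (mirrors the students example).
fields = {'name': 'TEXT', 'age': 'INTEGER', 'grade': 'INTEGER'}

# Table and column names cannot be bound with ? placeholders, so the
# statement is assembled as a string. Only do this with names you control.
columns = ', '.join(f'{col} {col_type}' for col, col_type in fields.items())
create_sql = f'CREATE TABLE students ({columns})'

demo_conn = sqlite3.connect(':memory:')
demo_conn.execute(create_sql)

# PRAGMA table_info returns one row per column; index 1 is the column name.
column_names = [row[1] for row in demo_conn.execute('PRAGMA table_info(students)')]
print(create_sql)
print(column_names)

demo_conn.close()
```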

In [19]:
# Drop the table if it already exists so this notebook can be re-run
cur.execute('DROP TABLE IF EXISTS BlackAthletes')

<sqlite3.Cursor at 0x791a1958c9c0>

In [20]:
# Create a new table named BlackAthletes
# Table fields will be id, name, sport, country, and olympic_medals
cur.execute('''
    CREATE TABLE BlackAthletes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        sport TEXT,
        country TEXT,
        olympic_medals INTEGER
    )
''')

conn.commit()


In [23]:
# Let's insert a record into the table
cur.execute('''
    INSERT INTO BlackAthletes (name, sport, country, olympic_medals)
    VALUES ('Michael Jordan', 'Basketball', 'USA', 2)
''')
conn.commit()

In [27]:
# Let's insert more than one record into the table
# Start with a list of tuples, one tuple per record
data = [
    ("Serena Williams", "Tennis", "USA", 4),
    ("LeBron James", "Basketball", "USA", 2),
    ("Simone Biles", "Gymnastics", "USA", 7),
    ("Usain Bolt", "Track & Field", "Jamaica", 8),
    ("Wilma Rudolph", "Track & Field", "USA", 3),
    ("Muhammad Ali", "Boxing", "USA", 1),
    ("Naomi Osaka", "Tennis", "Japan", 0),
    ("Pelé", "Soccer", "Brazil", 0),  # World Cups, not Olympics
    ("Allyson Felix", "Track & Field", "USA", 11),
    ("Jackie Robinson", "Baseball", "USA", 0)
]

# Insert data into the table
cur.executemany('''
    INSERT INTO BlackAthletes (name, sport, country, olympic_medals)
    VALUES (?, ?, ?, ?)
''', data)

conn.commit()

In [28]:
cur.execute('''
SELECT *
FROM BlackAthletes;''')

cur.fetchall()

[(1, 'Michael Jordan', 'Basketball', 'USA', 2),
 (2, 'Serena Williams', 'Tennis', 'USA', 4),
 (3, 'LeBron James', 'Basketball', 'USA', 2),
 (4, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (5, 'Usain Bolt', 'Track & Field', 'Jamaica', 8),
 (6, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (7, 'Muhammad Ali', 'Boxing', 'USA', 1),
 (8, 'Naomi Osaka', 'Tennis', 'Japan', 0),
 (9, 'Pelé', 'Soccer', 'Brazil', 0),
 (10, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (11, 'Jackie Robinson', 'Baseball', 'USA', 0),
 (12, 'Serena Williams', 'Tennis', 'USA', 4),
 (13, 'LeBron James', 'Basketball', 'USA', 2),
 (14, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (15, 'Usain Bolt', 'Track & Field', 'Jamaica', 8),
 (16, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (17, 'Muhammad Ali', 'Boxing', 'USA', 1),
 (18, 'Naomi Osaka', 'Tennis', 'Japan', 0),
 (19, 'Pelé', 'Soccer', 'Brazil', 0),
 (20, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (21, 'Jackie Robinson', 'Baseball', 'USA', 0),
 (22, 'Serena Williams',

In [31]:
# Leave out the id column so DISTINCT can collapse the repeated records
cur.execute('''
SELECT DISTINCT name, sport, country, olympic_medals
FROM BlackAthletes;''')

cur.fetchall()


[('Michael Jordan', 'Basketball', 'USA', 2),
 ('Serena Williams', 'Tennis', 'USA', 4),
 ('LeBron James', 'Basketball', 'USA', 2),
 ('Simone Biles', 'Gymnastics', 'USA', 7),
 ('Usain Bolt', 'Track & Field', 'Jamaica', 8),
 ('Wilma Rudolph', 'Track & Field', 'USA', 3),
 ('Muhammad Ali', 'Boxing', 'USA', 1),
 ('Naomi Osaka', 'Tennis', 'Japan', 0),
 ('Pelé', 'Soccer', 'Brazil', 0),
 ('Allyson Felix', 'Track & Field', 'USA', 11),
 ('Jackie Robinson', 'Baseball', 'USA', 0)]

In [35]:
# Delete every record with zero Olympic medals, then save the change
cur.execute('''
    DELETE FROM BlackAthletes
    WHERE olympic_medals = 0
''')
conn.commit()

cur.execute('''
SELECT *
FROM BlackAthletes;
''')


cur.fetchall()

[(1, 'Michael Jordan', 'Basketball', 'USA', 2),
 (2, 'Serena Williams', 'Tennis', 'USA', 4),
 (3, 'LeBron James', 'Basketball', 'USA', 2),
 (4, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (5, 'Usain Bolt', 'Track & Field', 'Jamaica', 8),
 (6, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (7, 'Muhammad Ali', 'Boxing', 'USA', 1),
 (10, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (12, 'Serena Williams', 'Tennis', 'USA', 4),
 (13, 'LeBron James', 'Basketball', 'USA', 2),
 (14, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (15, 'Usain Bolt', 'Track & Field', 'Jamaica', 8),
 (16, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (17, 'Muhammad Ali', 'Boxing', 'USA', 1),
 (20, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (22, 'Serena Williams', 'Tennis', 'USA', 4),
 (23, 'LeBron James', 'Basketball', 'USA', 2),
 (24, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (25, 'Usain Bolt', 'Track & Field', 'Jamaica', 8),
 (26, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (27, 'Muhammad Ali', 'Boxing', 'USA', 

In [32]:
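# Check for records with a missing sport; NULL is tested with IS NULL,
# not by comparing against the text "NULL"
cur.execute('''
SELECT *
    FROM BlackAthletes
    WHERE sport IS NULL;
''')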

cur.fetchall()

[]

In [37]:
# Combine two conditions with AND: USA athletes with at least 5 medals
cur.execute('''
    SELECT * FROM BlackAthletes
    WHERE country = 'USA' AND olympic_medals >= 5
''')

cur.fetchall()

[(4, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (10, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (14, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (20, 'Allyson Felix', 'Track & Field', 'USA', 11),
 (24, 'Simone Biles', 'Gymnastics', 'USA', 7),
 (30, 'Allyson Felix', 'Track & Field', 'USA', 11)]

In [29]:
# BETWEEN is inclusive, so this matches 2, 3, 4, 5, and 6 medals
cur.execute('''
SELECT *
    FROM BlackAthletes
    WHERE olympic_medals BETWEEN 2 AND 6;
''')

cur.fetchall()

[(1, 'Michael Jordan', 'Basketball', 'USA', 2),
 (2, 'Serena Williams', 'Tennis', 'USA', 4),
 (3, 'LeBron James', 'Basketball', 'USA', 2),
 (6, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (12, 'Serena Williams', 'Tennis', 'USA', 4),
 (13, 'LeBron James', 'Basketball', 'USA', 2),
 (16, 'Wilma Rudolph', 'Track & Field', 'USA', 3),
 (22, 'Serena Williams', 'Tennis', 'USA', 4),
 (23, 'LeBron James', 'Basketball', 'USA', 2),
 (26, 'Wilma Rudolph', 'Track & Field', 'USA', 3)]