## Extract: Process to pull data from a source system
## Load: Process to write data to a destination system

### Common upstream & downstream systems
- **OLTP Databases: Postgres, MySQL, SQLite, etc**
- **OLAP Databases: Snowflake, BigQuery, ClickHouse, DuckDB, etc**
- **Cloud data storage: AWS S3, GCP Cloud Storage, MinIO, etc**
- **Queue systems: Kafka, Redpanda, etc**
- **API**
- **Local disk: CSV, Excel, JSON, XML files**
- **SFTP/FTP server**

### Databases
When reading from or writing to a database, we use a database driver: a library that lets our code connect to the database, send it queries, and receive the results.
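
As a rough sketch of what working with a driver looks like, the snippet below uses Python's built-in `sqlite3` module against an in-memory database. The `users` table and its rows are hypothetical, made up purely for illustration. Most Python drivers follow a similar connect, cursor, execute, fetch pattern, though details vary by library.

```python
import sqlite3

# An in-memory SQLite database keeps the sketch self-contained (no file on disk).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, used only to illustrate the driver calls.
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])
conn.commit()

# Read the rows back as a list of tuples.
rows = cur.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)

conn.close()
```

The `?` placeholders let the driver bind values safely instead of building SQL strings by hand.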


- #### Question: How do you read data from a sqlite3 database and write to a DuckDB database?
- Hint: Look at importing the database libraries for sqlite3 and duckdb and create connections to talk to the respective databases

DuckDB is an in-process SQL OLAP database management system that can be used directly within a Python environment. It allows us to efficiently query large datasets without the need for a separate database server.

In [20]:
import sqlite3

# Connect to an SQLite database (or create it if it doesn't exist)
conn = sqlite3.connect('2-entertainment.db')

# Create a cursor object using the cursor() method
cursor = conn.cursor()

# Create table
cursor.execute('''CREATE TABLE IF NOT EXISTS movies
            (title, year, score)''')

# Insert multiple rows of data
data = [
    ('Monty Python and the Holy Grail', 1975, 8.2),
    ('And Now for Something Completely Different', 1971, 7.5), 
    ('Sopranos', 2001, 9.9),
    ('Breaking Bad', 2008, 9.8),
    ('Vikings', 2014, 9)
]
cursor.executemany("INSERT INTO movies VALUES(?, ?, ?)", data)

# Save (commit) the changes
conn.commit()

# Close the connection
conn.close()

In [21]:
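# Reconnect to the SQLite database and read every row from the movies table
conn = sqlite3.connect('2-entertainment.db')
cur = conn.cursor()

results = cur.execute("SELECT * FROM movies")
sqllite_results = results.fetchall()  # list of (title, year, score) tuples

# Close the connection now that the rows are in memory
conn.close()

sqllite_results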

[('Monty Python and the Holy Grail', 1975, 8.2),
 ('And Now for Something Completely Different', 1971, 7.5),
 ('Sopranos', 2001, 9.9),
 ('Breaking Bad', 2008, 9.8),
 ('Vikings', 2014, 9)]

In [22]:
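import duckdb

# Connect to a DuckDB database file (created if it doesn't exist)
con = duckdb.connect(database='2-ent.duckdb', read_only=False)

# Assumption: the cell as written never creates the target table, so create it here.
# FLOAT (single precision) is a guess consistent with the rounded scores in the output.
con.execute("CREATE TABLE IF NOT EXISTS movies (title VARCHAR, year INTEGER, score FLOAT)")

# Insert the rows read from SQLite; re-running this cell appends them again
con.executemany("INSERT INTO movies VALUES (?, ?, ?)", sqllite_results)
duckdb_results = con.execute("SELECT * FROM movies").fetchall()
con.close()

duckdb_results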

[('Monty Python and the Holy Grail', 1975, 8.199999809265137),
 ('And Now for Something Completely Different', 1971, 7.5),
 ('Sopranos', 2001, 9.899999618530273),
 ('Breaking Bad', 2008, 9.800000190734863),
 ('Vikings', 2014, 9.0),
 ('Monty Python and the Holy Grail', 1975, 8.199999809265137),
 ('And Now for Something Completely Different', 1971, 7.5),
 ('Sopranos', 2001, 9.899999618530273),
 ('Breaking Bad', 2008, 9.800000190734863),
 ('Vikings', 2014, 9.0)]
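
The scores come back from DuckDB as values like `8.199999809265137` rather than `8.2`. That pattern is what we would expect if the `score` column is stored as a 4-byte `FLOAT` (an assumption about the table's schema), because a 32-bit float cannot represent 8.2 exactly. The sketch below reproduces the rounding with only the standard library; `to_float32` is a hypothetical helper, not part of any driver.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("f", struct.pack("f", value))[0]


scores = [8.2, 7.5, 9.9, 9.8, 9.0]
rounded = [to_float32(s) for s in scores]
print(rounded)
# [8.199999809265137, 7.5, 9.899999618530273, 9.800000190734863, 9.0]
```

Values such as 7.5 and 9.0 survive unchanged because they are exactly representable in binary. If exact scores matter, declaring the column as `DOUBLE` or `DECIMAL` in DuckDB is one way to avoid this rounding.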