# Exercise: practice SQLAlchemy 🧪

Let's configure our first RDS instance on AWS and then use SQLAlchemy to write into our remote database and make some SQL queries!

1. Follow the instructions/videos from yesterday (M03-D03/04-Amazon-RDS.ipynb) to create your own RDS instance on AWS 
2. Download <a href="https://www.pgadmin.org/download/" target="_blank">PGAdmin</a> and configure it to access your remote database

If you get stuck at any step, don't hesitate to ask your classmates, your teacher or your TA for help 🤗.

**Before continuing, please execute the cell below (uncomment the lines first if the packages are not installed yet). It installs the packages required to access your remote database from this notebook:**

In [28]:
# Uncomment to install the PostgreSQL driver and the .env loader
# !pip install psycopg2-binary
# !pip install python-dotenv

3. Create a SQLAlchemy engine connected to your AWS RDS instance

In [33]:
# Import sqlalchemy
from sqlalchemy import create_engine, text

from dotenv import load_dotenv
import os

# Load the variables defined in a local .env file into the environment
load_dotenv()
%load_ext dotenv
%dotenv

# Read the connection settings from the environment
DBUSERNAME = os.getenv('DBUSERNAME')
DBPASSWORD = os.getenv('DBPASSWORD')
DBHOSTNAME = os.getenv('DBHOSTNAME')
DBNAME = os.getenv('DBNAME')


# create_engine describes how SQLAlchemy reaches the database (here PostgreSQL on RDS);
# the actual connection is only opened when the engine is first used
# Other URL formats, for reference:
# engine = create_engine("sqlite:///:memory:", echo=True)
# engine = create_engine(f"mysql+pymysql://{DBUSER}:{DBPASS}@{DBHOST}:{PORT}/{DBNAME}", echo=True)
# engine = create_engine(f"postgresql+psycopg2://{USERNAME}:{PASSWORD}@{HOSTNAME}/{DBNAME}", echo=True)
engine = create_engine(f"postgresql+psycopg2://{DBUSERNAME}:{DBPASSWORD}@{DBHOSTNAME}/{DBNAME}", echo=True)

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv
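Before creating any table, it can help to check that the engine really reaches the database. The sketch below runs a trivial query through `text`. It assumes an in-memory SQLite engine (the hypothetical `demo_engine`) so that it runs anywhere; with the notebook's `engine` the same two lines inside the `with` block should apply.

```python
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite engine stands in for the RDS engine,
# so this check runs without any credentials.
demo_engine = create_engine("sqlite:///:memory:")

# Opening a connection and running a trivial query confirms that the URL
# (and, for a remote database, the credentials and network access) is valid.
with demo_engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()

print(result)
```

If the same check fails against RDS, the usual suspects are the security group rules of the instance or a typo in the `.env` values.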


In [35]:
# Let's instantiate a declarative base so we can map Python classes to tables
# (since SQLAlchemy 1.4, declarative_base is imported from sqlalchemy.orm)
from sqlalchemy.orm import declarative_base
Base = declarative_base()

# Let's define our table using a class
from sqlalchemy import Column, Integer, String

class User(Base):
    __tablename__ = "customers"

    # Each attribute corresponds to a column in our DB table
    id = Column(Integer, primary_key=True)
    name = Column(String)
    country = Column(String)
    job = Column(String)
    age = Column(Integer)

    def __repr__(self):
        return "<User(name='{}', country='{}', job='{}', age='{}')>".format(self.name, self.country, self.job, self.age)

4. Create a new table named `customers` in your remote database and insert the following data:

| id | name       | country        | job        | age |
|----|------------|----------------|------------|-----|
| 1  | Sauerkraut | Germany        | engineer   | 37  |
| 2  | Jones      | United Kingdom | journalist | 52  |
| 3  | Dupont     | France         | dancer     | 25  |

Optional: use PGAdmin to check that the table has been created correctly

In [36]:
Base.metadata.create_all(engine)

2022-04-04 14:57:14,549 INFO sqlalchemy.engine.Engine select version()
2022-04-04 14:57:14,550 INFO sqlalchemy.engine.Engine [raw sql] {}
2022-04-04 14:57:14,719 INFO sqlalchemy.engine.Engine select current_schema()
2022-04-04 14:57:14,720 INFO sqlalchemy.engine.Engine [raw sql] {}
2022-04-04 14:57:14,888 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2022-04-04 14:57:14,889 INFO sqlalchemy.engine.Engine [raw sql] {}
2022-04-04 14:57:15,057 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-04-04 14:57:15,059 INFO sqlalchemy.engine.Engine select relname from pg_class c join pg_namespace n on n.oid=c.relnamespace where pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
2022-04-04 14:57:15,060 INFO sqlalchemy.engine.Engine [generated in 0.00092s] {'name': 'customers'}
2022-04-04 14:57:15,228 INFO sqlalchemy.engine.Engine COMMIT


In [None]:
# Create instances of User

# Creating a new instance of User lets us insert a new record later on
ed_user = User(id=1, name='Berenger Queune', country='France', job='Data Engineer', age=68)

# Access the full row
print(ed_user)

# Access ed_user's name
name = ed_user.name
print("name: {}".format(name))

# Access ed_user's job
job = ed_user.job
print("job: {}".format(job))

# Initialize a sessionmaker
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)

# Instantiate a Session

session = Session()

# Add values to the DB

al_user = User(id=2, name='Hajime No Ippo', country='Japon', job='Boxeur', age=20)

session.add(ed_user)
session.add(al_user)
# Commit the results

session.commit()

<User(name='Berenger Queune', country='France', job='Data Engineer', age='68')>
name: Berenger Queune
job: Data Engineer
2022-04-04 14:33:55,343 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-04-04 14:33:55,347 INFO sqlalchemy.engine.Engine INSERT INTO customers (id, name, country, job, age) VALUES (?, ?, ?, ?, ?)
2022-04-04 14:33:55,349 INFO sqlalchemy.engine.Engine [generated in 0.00137s] ((1, 'Berenger Queune', 'France', 'Data Engineer', 68), (2, 'Hajime No Ippo', 'Japon', 'Boxeur', 20))
2022-04-04 14:33:55,350 INFO sqlalchemy.engine.Engine COMMIT


In [None]:
# Query our customers table
user = session.query(User)

# Output all the results 
user.all()

2022-04-04 14:33:55,434 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-04-04 14:33:55,436 INFO sqlalchemy.engine.Engine SELECT customers.id AS customers_id, customers.name AS customers_name, customers.country AS customers_country, customers.job AS customers_job, customers.age AS customers_age 
FROM customers
2022-04-04 14:33:55,437 INFO sqlalchemy.engine.Engine [generated in 0.00094s] ()


[<User(name='Berenger Queune', country='France', job='Data Engineer', age='68')>,
 <User(name='Hajime No Ippo', country='Japon', job='Boxeur', age='20')>]
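The cells above insert two example users. The three rows listed in the table of step 4 could be inserted the same way. This is a minimal sketch under assumptions: it uses an in-memory SQLite engine instead of RDS, and the names `DemoBase`, `Customer`, `demo_engine` and `DemoSession` are hypothetical, chosen so they do not clash with the notebook's own objects.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

DemoBase = declarative_base()


class Customer(DemoBase):
    """Same columns as the `customers` table described in step 4."""
    __tablename__ = "customers"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    country = Column(String)
    job = Column(String)
    age = Column(Integer)


# Assumption: in-memory SQLite replaces the RDS engine for this sketch
demo_engine = create_engine("sqlite:///:memory:")
DemoBase.metadata.create_all(demo_engine)

# The three rows from the table in step 4
rows = [
    Customer(id=1, name="Sauerkraut", country="Germany", job="engineer", age=37),
    Customer(id=2, name="Jones", country="United Kingdom", job="journalist", age=52),
    Customer(id=3, name="Dupont", country="France", job="dancer", age=25),
]

DemoSession = sessionmaker(bind=demo_engine)
with DemoSession() as demo_session:
    # add_all stages several objects at once; commit writes them to the table
    demo_session.add_all(rows)
    demo_session.commit()
    # Read the names back to confirm the insert
    names = [c.name for c in demo_session.query(Customer).order_by(Customer.id)]

print(names)
```

Using the session as a context manager closes it automatically, which avoids leaving connections open on a remote instance.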

5. Execute the cell below to download the famous iris dataset:

6. Create a table in your remote database containing the information of the dataset:
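The download cell is not reproduced here, so the following is only one possible way to do steps 5 and 6. It assumes the copy of iris bundled with scikit-learn, pandas' `to_sql` for the upload, and an in-memory SQLite engine in place of RDS. The table name `iris` and the column names (`sepal_length`, `species`, and so on) are assumptions, not the course's official ones.

```python
from sklearn.datasets import load_iris
from sqlalchemy import create_engine, text

# Assumption: scikit-learn's bundled copy of the iris dataset is used
iris = load_iris(as_frame=True)

# Rename the columns to SQL-friendly identifiers (names are an assumption)
df = iris.frame.rename(columns={
    "sepal length (cm)": "sepal_length",
    "sepal width (cm)": "sepal_width",
    "petal length (cm)": "petal_length",
    "petal width (cm)": "petal_width",
})

# Replace the numeric target codes (0, 1, 2) with the species names
df["species"] = df["target"].map(lambda code: str(iris.target_names[code]))
df = df.drop(columns="target")

# Assumption: in-memory SQLite replaces the RDS engine for this sketch
demo_engine = create_engine("sqlite:///:memory:")

# Write the DataFrame to a table named "iris", replacing it if it exists
df.to_sql("iris", demo_engine, if_exists="replace", index=False)

# Count the rows to confirm the upload
with demo_engine.connect() as conn:
    n_rows = conn.execute(text("SELECT COUNT(*) FROM iris")).scalar()

print(n_rows)
```

With the notebook's `engine` in place of `demo_engine`, the same `to_sql` call should create the table on the remote database.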

### Now let's run some SQL queries!

To answer the following questions, don't hesitate to refer to <a href="https://www.sqltutorial.org/sql-cheat-sheet/" target="_blank">this cheatsheet</a>. 😉

7. What are the different species present in this dataset?
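One way to approach this question is `SELECT DISTINCT`. The sketch below is not the official solution: it runs on a handful of made-up rows (not the real iris measurements) in an in-memory SQLite database, and the table name `iris` and column names `species` and `sepal_length` are assumptions.

```python
from sqlalchemy import create_engine, text

# Made-up sample rows for illustration only (NOT the real iris data)
sample_rows = [
    {"species": "setosa", "sepal_length": 5.0},
    {"species": "setosa", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 5.5},
    {"species": "virginica", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 7.5},
]

# Assumption: in-memory SQLite replaces the RDS engine for this sketch
demo_engine = create_engine("sqlite:///:memory:")

with demo_engine.begin() as conn:
    conn.execute(text("CREATE TABLE iris (species TEXT, sepal_length REAL)"))
    conn.execute(
        text("INSERT INTO iris (species, sepal_length) VALUES (:species, :sepal_length)"),
        sample_rows,
    )
    # DISTINCT keeps one row per species value
    result = conn.execute(text("SELECT DISTINCT species FROM iris ORDER BY species"))
    species = [row[0] for row in result]

print(species)
```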

8. What is the average sepal length among all species?

9. What is the average sepal length for each species?
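Questions 8 and 9 differ only by a `GROUP BY` clause: `AVG` alone aggregates the whole table, while `AVG` with `GROUP BY species` aggregates each group separately. The sketch below illustrates both on made-up rows (not the real iris measurements) in an in-memory SQLite database; the table and column names are assumptions.

```python
from sqlalchemy import create_engine, text

# Made-up sample rows for illustration only (NOT the real iris data)
sample_rows = [
    {"species": "setosa", "sepal_length": 5.0},
    {"species": "setosa", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 5.5},
    {"species": "virginica", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 7.5},
]

# Assumption: in-memory SQLite replaces the RDS engine for this sketch
demo_engine = create_engine("sqlite:///:memory:")

with demo_engine.begin() as conn:
    conn.execute(text("CREATE TABLE iris (species TEXT, sepal_length REAL)"))
    conn.execute(
        text("INSERT INTO iris (species, sepal_length) VALUES (:species, :sepal_length)"),
        sample_rows,
    )

    # Question 8: one average over every row of the table
    overall_avg = conn.execute(text("SELECT AVG(sepal_length) FROM iris")).scalar()

    # Question 9: one average per species
    result = conn.execute(text(
        "SELECT species, AVG(sepal_length) AS avg_sepal_length "
        "FROM iris GROUP BY species ORDER BY species"
    ))
    avg_by_species = {row[0]: row[1] for row in result}

print(overall_avg)
print(avg_by_species)
```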

10. How many samples of 'virginica' have sepal length < 6?

11. For each species, count the number of samples having sepal length < 6:
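Questions 10 and 11 can both be approached with `COUNT(*)` and a `WHERE` filter; question 11 adds a `GROUP BY`. The sketch below uses made-up rows (not the real iris measurements) in an in-memory SQLite database, with assumed table and column names. The bound parameters `:species` and `:max_length` are illustrative names.

```python
from sqlalchemy import create_engine, text

# Made-up sample rows for illustration only (NOT the real iris data)
sample_rows = [
    {"species": "setosa", "sepal_length": 5.0},
    {"species": "setosa", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 5.5},
    {"species": "versicolor", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 5.5},
    {"species": "virginica", "sepal_length": 6.5},
    {"species": "virginica", "sepal_length": 7.5},
]

# Assumption: in-memory SQLite replaces the RDS engine for this sketch
demo_engine = create_engine("sqlite:///:memory:")

with demo_engine.begin() as conn:
    conn.execute(text("CREATE TABLE iris (species TEXT, sepal_length REAL)"))
    conn.execute(
        text("INSERT INTO iris (species, sepal_length) VALUES (:species, :sepal_length)"),
        sample_rows,
    )

    # Question 10: count the rows of one species below the threshold
    virginica_count = conn.execute(
        text(
            "SELECT COUNT(*) FROM iris "
            "WHERE species = :species AND sepal_length < :max_length"
        ),
        {"species": "virginica", "max_length": 6},
    ).scalar()

    # Question 11: same filter, counted separately for each species
    result = conn.execute(
        text(
            "SELECT species, COUNT(*) AS n_samples FROM iris "
            "WHERE sepal_length < :max_length "
            "GROUP BY species ORDER BY species"
        ),
        {"max_length": 6},
    )
    count_by_species = {row[0]: row[1] for row in result}

print(virginica_count)
print(count_by_species)
```

Note that a species with no sample below the threshold does not appear in the grouped result at all, because the `WHERE` clause removes its rows before grouping.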