# Intro to SQLAlchemy
Remembering SQL syntax and database schemas is hard. SQLAlchemy is a Python SQL toolkit and Object-Relational Mapping (ORM) library that abstracts away much of the complexity of writing SQL queries and lets you interact with databases using Python objects. This notebook introduces the basics of SQLAlchemy and shows how to interact with databases from Python.

In this section, we rework the previous examples using SQLAlchemy. The database schema and queries stay the same as before; only the way we interact with the database changes.

# Defining a Schema
The first step in using SQLAlchemy is to define a schema for the database. This is done by creating a class for each table in the database. Each class should inherit from a `Base` class created with SQLAlchemy's `declarative_base()` function, set the table name with the `__tablename__` attribute, and declare each column as a `Column` attribute.

```python
from sqlalchemy import Column, Integer, String, Date, Float, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Employee(Base):
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    last_name = Column(String)
    start_date = Column(Date)
    termination_date = Column(Date)
    salary = Column(Float)
    department = Column(String)
```
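
To check that a class describes the table you expect, you can inspect the table object that SQLAlchemy builds from it. The following is a minimal sketch that repeats the `Employee` class from above so it runs on its own; the variable names `column_names` and `primary_keys` are only illustrative.

```python
from sqlalchemy import Column, Date, Float, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Employee(Base):
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    last_name = Column(String)
    start_date = Column(Date)
    termination_date = Column(Date)
    salary = Column(Float)
    department = Column(String)

# Each Column attribute becomes a column on the underlying Table object
column_names = [column.name for column in Employee.__table__.columns]
primary_keys = [column.name for column in Employee.__table__.primary_key.columns]

print(Employee.__tablename__)
print(column_names)
print(primary_keys)
```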

# Connecting to a Database
Once the schema is defined, you can connect to a database using SQLAlchemy's `create_engine` function. It takes a connection string as an argument and returns an engine object that is used to interact with the database. The connection string also specifies the type of database you are connecting to (e.g., SQLite, MySQL, PostgreSQL); in the example below, the `sqlite:///` prefix selects SQLite. The engine manages connections to the database and executes SQL queries, taking over the role that `conn` and `cursor` played with `sqlite3`.

```python
engine = create_engine('sqlite:///employees2.db')
```
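
As a quick check that an engine works, you can open a connection and run a trivial query. This sketch assumes an in-memory SQLite database (`sqlite:///:memory:`) rather than `employees2.db`, so that running it does not create or touch any file.

```python
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite database, used here only for illustration
engine = create_engine('sqlite:///:memory:')

# The engine hands out connections on demand; text() wraps a raw SQL string
with engine.connect() as conn:
    result = conn.execute(text('SELECT 1 + 1')).scalar()

print(engine.dialect.name)  # the database type taken from the connection string
print(result)
```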

# Creating Tables
To create the tables in the database, you can use the `Base.metadata.create_all` method and pass in the engine object. If the table already exists, this method will not create it again.

```python
Base.metadata.create_all(engine)
```
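
One way to confirm that the table exists is SQLAlchemy's `inspect` function. The sketch below uses a trimmed-down `Employee` class and an in-memory database, both assumptions made to keep the example short, and calls `create_all` twice to show that the second call is harmless.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)

engine = create_engine('sqlite:///:memory:')

Base.metadata.create_all(engine)
Base.metadata.create_all(engine)  # the table already exists, so nothing happens

# Ask the database which tables it now contains
table_names = inspect(engine).get_table_names()
print(table_names)
```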

In [2]:
# Create a new database called "employees2.db" that includes a table called "employees" with the same columns and types as the original "employees" table.




## Mapping Data to Objects
The `Employee` class does more than define the table's schema: each instance represents one row, so records read from a file can be mapped directly onto `Employee` objects.

### Read the data from the CSV file
```python
import pandas as pd
data = pd.read_csv('./sample_data/employees.csv')
```

### Convert the rows of the DataFrame to a list of dictionaries
```python
data = data.to_dict(orient='records')
```

### Create a list of Employee objects
```python
employees = [Employee(**row) for row in data]
```

As long as the column names in the dataframe match the column names in the class, the data will be mapped to the Employee object correctly.

With this, you can individually see the values of the columns of the Employee object by calling the object's attributes.

```python
print(employees[0].first_name)
```
This should print "Alice".

In [11]:
# Read in employees.csv and convert into a list of Employee objects. Ensure the list of objects is called employees.




type(employees[0].salary)

str

Notice that the salary is still a string, not a float: declaring a column as `Float` does not convert values that are assigned to the attribute. You can either clean the DataFrame before mapping or define a validator in the class itself.

```python
from sqlalchemy import Column, Integer, String, Date, Float, create_engine
from sqlalchemy.orm import declarative_base, validates

Base = declarative_base()

class Employee(Base):
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    last_name = Column(String)
    start_date = Column(Date)
    termination_date = Column(Date)
    salary = Column(Float)
    department = Column(String)

    @validates('salary')
    def validate_salary(self, key, value):
        if isinstance(value, str):
            # Remove dollar sign and commas, then convert to float
            value = value.replace('$', '').replace(',', '')
        return float(value) if value else None

engine = create_engine('sqlite:///employees2.db')
Base.metadata.create_all(engine)
```
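
To see the validator at work without touching a database, you can construct a few objects by hand. This is a small sketch using a trimmed-down class with only the `id` and `salary` columns; the sample values are made up for illustration.

```python
from sqlalchemy import Column, Float, Integer
from sqlalchemy.orm import declarative_base, validates

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    salary = Column(Float)

    @validates('salary')
    def validate_salary(self, key, value):
        if isinstance(value, str):
            # Remove dollar sign and commas, then convert to float
            value = value.replace('$', '').replace(',', '')
        return float(value) if value else None

# The validator runs whenever the attribute is set, including in the constructor
formatted = Employee(id=1, salary='$55,000.00')
missing = Employee(id=2, salary='')

print(formatted.salary, type(formatted.salary))
print(missing.salary)
```

Note that, as written, the validator also turns an empty string or a salary of `0` into `None`, because both are falsy in Python.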


In [None]:
# Amend the Employee class to clean the problematic columns. This includes the start_date, termination_date, and salary columns. Once done, re-map the employees list.



type(employees[0].salary) #if done correctly, this should print float

# Interacting with the Database
Once the tables are created and the data is mapped to the objects, you can interact with the database using the `Session` object provided by SQLAlchemy. The `Session` object is used to manage transactions with the database and provides methods for querying, inserting, updating, and deleting data.

## Inserting Data
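```python
from sqlalchemy.orm import sessionmaker

# sessionmaker returns a factory; calling the factory creates a Session
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add_all(employees)
    session.commit()
```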

This will add all the Employee objects to the database and commit the transaction. You can also query the database using the `query` method of the `Session` object.
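
A simple way to confirm that an insert worked is to count the rows afterwards. The sketch below is self-contained and makes a few assumptions: an in-memory database, a trimmed-down `Employee` class, and two made-up rows.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    department = Column(String)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Hypothetical sample rows
employees = [
    Employee(id=1, first_name='Alice', department='HR'),
    Employee(id=2, first_name='Bob', department='IT'),
]

with Session() as session:
    session.add_all(employees)
    session.commit()
    # Count the rows now stored in the table
    row_count = session.query(Employee).count()

print(row_count)
```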


## Querying Data
```python
Session = sessionmaker(bind=engine)

with Session() as session:
    employees = session.query(Employee).all()
    for employee in employees:
        print(employee.first_name, employee.last_name)
```

This will print the first and last names of all the employees in the database.

To have a query with a condition, you can use the `filter` method of the `query` object.

```python
Session = sessionmaker(bind=engine)

with Session() as session:
    employees = session.query(Employee).filter(Employee.department == 'HR').all()
    for employee in employees:
        print(employee.first_name, employee.last_name)
```
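
`filter` also accepts several conditions at once, which are combined with AND, and the result can be sorted with `order_by`. The following sketch assumes an in-memory database, a trimmed-down `Employee` class, and made-up sample rows.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    salary = Column(Float)
    department = Column(String)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    # Hypothetical sample rows
    session.add_all([
        Employee(id=1, first_name='Alice', salary=50000.0, department='HR'),
        Employee(id=2, first_name='Bob', salary=65000.0, department='IT'),
        Employee(id=3, first_name='Carol', salary=80000.0, department='IT'),
        Employee(id=4, first_name='Dan', salary=40000.0, department='IT'),
    ])
    session.commit()

    # Both conditions must hold; results are sorted by salary, highest first
    matches = (
        session.query(Employee)
        .filter(Employee.department == 'IT', Employee.salary >= 60000)
        .order_by(Employee.salary.desc())
        .all()
    )
    names = [employee.first_name for employee in matches]

print(names)
```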

## Updating Data
To update a record, you can query the record, update the attributes, and commit the transaction.

```python
Session = sessionmaker(bind=engine)

with Session() as session:
    employee = session.query(Employee).filter(Employee.first_name == 'Alice').first()
    employee.salary = 60000
    session.commit()
```
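
Because `first()` returns `None` when no row matches, it can be worth guarding the update. The sketch below also reads the row back in a fresh session to confirm the change was saved. The in-memory database, the trimmed-down class, and the sample row are assumptions for illustration.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    salary = Column(Float)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(Employee(id=1, first_name='Alice', salary=50000.0))
    session.commit()

with Session() as session:
    employee = session.query(Employee).filter(Employee.first_name == 'Alice').first()
    if employee is not None:  # first() returns None when nothing matches
        employee.salary = 60000
        session.commit()

# Read the row back by primary key in a new session
with Session() as session:
    updated_salary = session.get(Employee, 1).salary

print(updated_salary)
```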

## Deleting Data
To delete a record, you can query the record, pass it to the session's `delete` method, and commit the transaction.

```python
Session = sessionmaker(bind=engine)

with Session() as session:
    employee = session.query(Employee).filter(Employee.first_name == 'Alice').first()
    session.delete(employee)
    session.commit()
```
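
If no row matches, `first()` returns `None`, and passing `None` to `session.delete` raises an error. One possible way to handle this is a small helper that reports whether anything was deleted. `delete_by_first_name` is a hypothetical function written for this sketch, not part of SQLAlchemy, and the database and sample rows are likewise assumptions.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)

def delete_by_first_name(session, first_name):
    """Hypothetical helper: delete the first matching row, if there is one."""
    employee = session.query(Employee).filter(Employee.first_name == first_name).first()
    if employee is None:
        return False  # nothing to delete
    session.delete(employee)
    session.commit()
    return True

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add_all([Employee(id=1, first_name='Alice'), Employee(id=2, first_name='Bob')])
    session.commit()

    first_attempt = delete_by_first_name(session, 'Alice')
    second_attempt = delete_by_first_name(session, 'Alice')  # already gone
    remaining = session.query(Employee).count()

print(first_attempt, second_attempt, remaining)
```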

# Handling Duplicates
Just as in the previous example, duplicate primary keys cause problems. If you use `session.add_all` and a row with the same primary key already exists in the table, SQLAlchemy raises an error (typically an `IntegrityError`) when the session is flushed or committed. You can handle this in a similar way: check whether the row exists, update it if it does, and insert it otherwise.

```python
Session = sessionmaker(bind=engine)

with Session() as session:
    for employee in employees:
        # `employee` is the new object built from the CSV;
        # `existing` is the row already stored in the table, or None
        existing = session.query(Employee).filter(Employee.id == employee.id).first()
        if existing:
            # Assigning to the loaded row's attributes updates it on commit
            existing.first_name = employee.first_name
            existing.last_name = employee.last_name
            existing.start_date = employee.start_date
            existing.termination_date = employee.termination_date
            existing.salary = employee.salary
            existing.department = employee.department
        else:
            session.add(employee)
    session.commit()
```
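
SQLAlchemy also offers `session.merge`, which looks up an object by primary key, updates the stored row if one exists, and inserts a new row otherwise. The sketch below shows it as a possible alternative to the explicit check above; the in-memory database, the trimmed-down class, and the sample rows are assumptions for illustration.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Employee(Base):
    # Trimmed-down version of the class above, for illustration only
    __tablename__ = 'employee_data'

    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    salary = Column(Float)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# One row is already stored in the table
with Session() as session:
    session.add(Employee(id=1, first_name='Alice', salary=50000.0))
    session.commit()

# Incoming data: id 1 is a duplicate, id 2 is new (hypothetical values)
incoming = [
    Employee(id=1, first_name='Alice', salary=60000.0),
    Employee(id=2, first_name='Bob', salary=45000.0),
]

with Session() as session:
    for employee in incoming:
        session.merge(employee)  # update if the id exists, insert otherwise
    session.commit()
    salaries = {row.id: row.salary for row in session.query(Employee).order_by(Employee.id)}

print(salaries)
```

`merge` issues a lookup for every object, so for very large files the explicit approach or a bulk strategy may be a better fit.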

In [None]:
# read employees.csv and add the data to the database employees2.db






In [None]:
# Update the database employees2.db with the employees2.csv file




In [None]:
# Return a list of employees who make more than $70,000. There should be 10





# Return a list of employees that work in the HR department. There should be 4




# Return a list of employees who were hired in 2020. There should be 3




# Return a list of all active employees that do not have a termination date. There should be 9




