# Creating tables with SQLAlchemy

To create a new table, you still use the Table object; however, you replace the autoload and autoload_with parameters with Column objects.

The Column object takes a name, a SQLAlchemy type with an optional format, and optional keyword arguments for different constraints.
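
As a rough illustration of those three parts, the sketch below declares a few columns on a hypothetical products table. The table name, column names, and the choice of Numeric are assumptions made for this example and do not come from the course material.

```python
from sqlalchemy import MetaData, Table, Column, String, Integer, Numeric

# Hypothetical table used only to illustrate Column arguments
metadata = MetaData()
products = Table(
    'products', metadata,
    # name, type, and a primary-key constraint as a keyword argument
    Column('id', Integer(), primary_key=True),
    # String takes an optional length; nullable=False adds NOT NULL
    Column('title', String(100), nullable=False),
    # Numeric takes precision and scale as its format
    Column('price', Numeric(10, 2)),
)

# Inspect what was declared; nothing is sent to a database yet
for column in products.columns:
    print(column.name, column.type, column.nullable, column.primary_key)
```

Defining the Table only registers it on metadata; the database is not touched until the metadata is used to create the table.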

When defining the table, recall how in the video Jason passed in 255 as the maximum length of a String by using Column('name', String(255)). Checking out the slides from the video may help.

After defining the table, you can create the table in the database by using the .create_all() method on metadata and supplying the engine as the only parameter. Go for it!
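
One way to check that .create_all() did its job is to ask the database for its table names. The sketch below assumes an in-memory SQLite engine standing in for the course database and uses SQLAlchemy's inspect() function; the single-column table is a shortened version of data for brevity.

```python
from sqlalchemy import create_engine, inspect, MetaData, Table, Column, String

# Assumption: an in-memory SQLite database stands in for the course database
engine = create_engine('sqlite:///:memory:')
metadata = MetaData()

data = Table('data', metadata, Column('name', String(255)))

# Emits CREATE TABLE for every table registered on metadata
metadata.create_all(engine)

# Calling it again is safe: by default, tables that already exist are skipped
metadata.create_all(engine)

# Ask the database which tables it now contains
table_names = inspect(engine).get_table_names()
print(table_names)
```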

Instructions
* Import Table, Column, String, Integer, Float, Boolean from sqlalchemy.
* Build a new table called data with columns 'name' (String(255)), 'count' (Integer()), 'amount' (Float()), and 'valid' (Boolean()).
* The second argument of Table() needs to be metadata, which has already been initialized.
* Create the table in the database by passing engine to metadata.create_all().

In [8]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, Integer, Float, Boolean

# Create an engine (using SQLite in-memory database as an example)
engine = create_engine('sqlite:///:memory:', echo=True)

# Create a MetaData instance
metadata = MetaData()

# Define the data table
data = Table('data', metadata,
             Column('name', String(255)),
             Column('count', Integer()),
             Column('amount', Float()),
             Column('valid', Boolean())
)

# Create the table in the database
metadata.create_all(engine)

# Print table representation
print(repr(data))

2025-04-10 16:11:01,014 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:11:01,015 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("data")
2025-04-10 16:11:01,015 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:11:01,016 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("data")
2025-04-10 16:11:01,017 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:11:01,018 INFO sqlalchemy.engine.Engine 
CREATE TABLE data (
	name VARCHAR(255), 
	count INTEGER, 
	amount FLOAT, 
	valid BOOLEAN
)


2025-04-10 16:11:01,018 INFO sqlalchemy.engine.Engine [no key 0.00045s] ()
2025-04-10 16:11:01,020 INFO sqlalchemy.engine.Engine COMMIT
Table('data', MetaData(), Column('name', String(length=255), table=<data>), Column('count', Integer(), table=<data>), Column('amount', Float(), table=<data>), Column('valid', Boolean(), table=<data>), schema=None)


## Constraints and data defaults

You're now going to practice creating a table with some constraints! Often, you'll need to make sure that a column is unique, nullable, a positive value, or related to a column in another table. This is where constraints come in.

As Jason showed you in the video, in addition to constraints, you can use the default keyword on a column to set the value that is used when no data is passed for that column.
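
To see a constraint and a default in action, here is a minimal sketch. It assumes an in-memory SQLite engine and a hypothetical demo table, neither of which is part of the exercise. It inserts a row that omits the defaulted columns, then tries to insert a duplicate name.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, String,
                        Integer, Boolean, insert, select)
from sqlalchemy.exc import IntegrityError

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()

# Hypothetical table with one unique column and two defaulted columns
demo = Table('demo', metadata,
             Column('name', String(255), unique=True),
             Column('count', Integer(), default=1),
             Column('valid', Boolean(), default=False))
metadata.create_all(engine)

with engine.begin() as conn:
    # Only name is supplied, so count and valid fall back to their defaults
    conn.execute(insert(demo).values(name='Anna'))
    row = conn.execute(select(demo)).first()
    print(row)

duplicate_rejected = False
try:
    with engine.begin() as conn:
        # A second 'Anna' violates the unique constraint
        conn.execute(insert(demo).values(name='Anna'))
except IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Note that default= is applied by SQLAlchemy when the insert is executed, so it does not add a DEFAULT clause to the CREATE TABLE statement.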

Instructions
* Table, Column, String, Integer, Float, Boolean are already imported from sqlalchemy.
* Build a new table called data with a unique name (String), count (Integer) defaulted to 1, amount (Float), and valid (Boolean) defaulted to False.
* Submit the answer to create the table in the database and to print the table details for data.

In [13]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, Integer, Float, Boolean

# Create an engine (using SQLite in-memory database as an example)
engine = create_engine('sqlite:///:memory:', echo=True)

# Create a MetaData instance
metadata = MetaData()

# Define a new table with a name, count, amount, and valid column: data
data = Table('data', metadata,
             Column('name', String(255), unique=True),
             Column('count', Integer(), default=1),
             Column('amount', Float()),
             Column('valid', Boolean(), default=False)
)

# Use the metadata to create the table
metadata.create_all(engine)

# Print the table details
print(repr(metadata.tables['data']))

2025-04-10 16:13:37,024 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:13:37,024 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("data")
2025-04-10 16:13:37,024 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:13:37,024 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("data")
2025-04-10 16:13:37,024 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:13:37,031 INFO sqlalchemy.engine.Engine 
CREATE TABLE data (
	name VARCHAR(255), 
	count INTEGER, 
	amount FLOAT, 
	valid BOOLEAN, 
	UNIQUE (name)
)


2025-04-10 16:13:37,031 INFO sqlalchemy.engine.Engine [no key 0.00056s] ()
2025-04-10 16:13:37,033 INFO sqlalchemy.engine.Engine COMMIT
Table('data', MetaData(), Column('name', String(length=255), table=<data>), Column('count', Integer(), table=<data>, default=ScalarElementColumnDefault(1)), Column('amount', Float(), table=<data>), Column('valid', Boolean(), table=<data>, default=ScalarElementColumnDefault(False)), schema=None)


## Inserting a single row

There are several ways to perform an insert with SQLAlchemy; however, we are going to focus on the one that follows the same pattern as the select statement.

It uses an insert statement where you specify the table as an argument and supply the data you wish to insert as keyword arguments to the .values() method. For example, if my_table contains columns my_col_1 and my_col_2, then insert(my_table).values(my_col_1=5, my_col_2="Example") will create a row in my_table with the value in my_col_1 equal to 5 and the value in my_col_2 equal to "Example".

Notice the difference in syntax: when appending a where statement to an existing statement, we include the name of the table as well as the name of the column, for example new_stmt = old_stmt.where(my_tbl.columns.my_col == 15). This is necessary because the existing statement might involve several tables.

On the other hand, you can only insert a record into a single table, so you do not need to include the name of the table when using values() to insert, e.g. stmt = insert(my_table).values(my_col = 10).
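
The difference between the two styles is easy to see by printing the statements. This sketch builds the hypothetical my_table from the example above (the column types are assumptions) and renders both an insert and a select with a where clause, without connecting to a database.

```python
from sqlalchemy import MetaData, Table, Column, Integer, String, insert, select

metadata = MetaData()
# my_table mirrors the hypothetical table named in the text; types are assumed
my_table = Table('my_table', metadata,
                 Column('my_col_1', Integer()),
                 Column('my_col_2', String(50)))

# values() takes bare column names as keyword arguments
insert_stmt = insert(my_table).values(my_col_1=5, my_col_2='Example')

# where() takes a full column expression, table included
select_stmt = select(my_table).where(my_table.columns.my_col_1 == 5)

# Printing a statement shows the SQL it will render, with bind placeholders
print(insert_stmt)
print(select_stmt)
```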

Here, the name of the table is data. You can run repr(data) in the console to examine the structure of the table.

Instructions
* Import insert and select from the sqlalchemy module.
* Build an insert statement insert_stmt for the data table to set name to 'Anna', count to 1, amount to 1000.00, and valid to True.
* Execute insert_stmt with the connection and store the results.
* Print the .rowcount attribute of results to see how many records were inserted.
* Build a select statement to query data for the record with the name of 'Anna'.
* Run the solution to print the results of executing the select statement.

In [21]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, Integer, Float, Boolean, insert, select

# Create an engine (using SQLite in-memory database as an example)
engine = create_engine('sqlite:///:memory:', echo=True)

# Create a MetaData instance
metadata = MetaData()

# Define a new table with a name, count, amount, and valid column: data
data = Table('data', metadata,
             Column('name', String(255), unique=True),
             Column('count', Integer(), default=1),
             Column('amount', Float()),
             Column('valid', Boolean(), default=False)
)

# Use the metadata to create the table
metadata.create_all(engine)

# Print the table details
print(repr(metadata.tables['data']))

# Build an insert statement to insert a record into the data table
insert_stmt = insert(data).values(name='Anna', count=1, amount=1000.00, valid=True)

# Create a connection on engine
connection = engine.connect()

# Execute the insert statement via the connection
results = connection.execute(insert_stmt)

# Print result rowcount
print(results.rowcount)

# Build a select statement to validate the insert
select_stmt = select(data).where(data.c.name == 'Anna')

# Print the result of executing the query
print(connection.execute(select_stmt).first())

# Close the connection
connection.close()

2025-04-10 16:17:07,653 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:17:07,654 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("data")
2025-04-10 16:17:07,654 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:17:07,655 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("data")
2025-04-10 16:17:07,655 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:17:07,657 INFO sqlalchemy.engine.Engine 
CREATE TABLE data (
	name VARCHAR(255), 
	count INTEGER, 
	amount FLOAT, 
	valid BOOLEAN, 
	UNIQUE (name)
)


2025-04-10 16:17:07,657 INFO sqlalchemy.engine.Engine [no key 0.00041s] ()
2025-04-10 16:17:07,658 INFO sqlalchemy.engine.Engine COMMIT
Table('data', MetaData(), Column('name', String(length=255), table=<data>), Column('count', Integer(), table=<data>, default=ScalarElementColumnDefault(1)), Column('amount', Float(), table=<data>), Column('valid', Boolean(), table=<data>, default=ScalarElementColumnDefault(False)), schema=None)
2025-04-10 16:17:07,659 INFO

## Inserting multiple records at once

It's time to practice inserting multiple records at once!

As Jason showed you in the video, when inserting multiple records at once, you do not use the .values() method. Instead, you first build a list of dictionaries that represents the data you want to insert, with the keys being the names of the columns. In the .execute() method, you pair this list of dictionaries with an insert statement, which inserts all the records in your list of dictionaries.
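
One detail worth keeping in mind is transaction handling: in SQLAlchemy 2.0, work done on a connection from engine.connect() is rolled back when the connection is closed unless it has been committed. The sketch below assumes an in-memory SQLite engine and a hypothetical people table. It runs the multi-row insert inside engine.begin(), which commits automatically.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, String,
                        Integer, insert, select, func)

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()
# Hypothetical two-column table, used only for this illustration
people = Table('people', metadata,
               Column('name', String(255)),
               Column('count', Integer()))
metadata.create_all(engine)

values_list = [
    {'name': 'Anna', 'count': 1},
    {'name': 'Taylor', 'count': 1},
]

# engine.begin() commits on success and rolls back if an exception is raised
with engine.begin() as conn:
    results = conn.execute(insert(people), values_list)
    inserted = results.rowcount

# A later block still sees the rows because the insert was committed
with engine.connect() as conn:
    total = conn.execute(select(func.count()).select_from(people)).scalar()

print(inserted, total)
```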

Instructions
* Build a list of dictionaries called values_list with two dictionaries. In the first dictionary set name to 'Anna', count to 1, amount to 1000.00, and valid to True. In the second dictionary of the list, set name to 'Taylor', count to 1, amount to 750.00, and valid to False.
* Build an insert statement for the data table for a multiple insert, save it as stmt.
* Execute stmt with the values_list via connection and store the results. Make sure values_list is the second argument to .execute().
* Print the rowcount of the results.

In [24]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, Integer, Float, Boolean, insert, select

# Create an engine (using SQLite in-memory database as an example)
engine = create_engine('sqlite:///:memory:', echo=True)

# Create a MetaData instance
metadata = MetaData()

# Define a new table with a name, count, amount, and valid column: data
data = Table('data', metadata,
             Column('name', String(255), unique=True),
             Column('count', Integer(), default=1),
             Column('amount', Float()),
             Column('valid', Boolean(), default=False)
)

# Use the metadata to create the table
metadata.create_all(engine)

# Create a connection on engine
connection = engine.connect()

# Build a list of dictionaries: values_list
values_list = [
    {'name': 'Anna', 'count': 1, 'amount': 1000.00, 'valid': True},
    {'name': 'Taylor', 'count': 1, 'amount': 750.00, 'valid': False}
]

# Build an insert statement for the data table: stmt
stmt = insert(data)

# Execute stmt with the values_list: results
results = connection.execute(stmt, values_list)

# Print rowcount
print(results.rowcount)

# Verify the insert with a select statement
select_stmt = select(data)
print("Inserted records:")
for row in connection.execute(select_stmt).fetchall():
    print(row)

# Close the connection
connection.close()

2025-04-10 16:19:18,077 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:19:18,077 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("data")
2025-04-10 16:19:18,078 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:19:18,079 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("data")
2025-04-10 16:19:18,080 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:19:18,082 INFO sqlalchemy.engine.Engine 
CREATE TABLE data (
	name VARCHAR(255), 
	count INTEGER, 
	amount FLOAT, 
	valid BOOLEAN, 
	UNIQUE (name)
)


2025-04-10 16:19:18,082 INFO sqlalchemy.engine.Engine [no key 0.00070s] ()
2025-04-10 16:19:18,083 INFO sqlalchemy.engine.Engine COMMIT
2025-04-10 16:19:18,085 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:19:18,086 INFO sqlalchemy.engine.Engine INSERT INTO data (name, count, amount, valid) VALUES (?, ?, ?, ?)
2025-04-10 16:19:18,086 INFO sqlalchemy.engine.Engine [generated in 0.00145s] [('Anna', 1, 1000.0, 1), ('Taylor', 1, 750.0, 0)]
2
Ins

## Loading a CSV into a table

You've done a great job so far at inserting data into tables! You're now going to learn how to load the contents of a CSV file into a table.

One way to do that would be to read a CSV file line by line, create a dictionary from each line, and then use insert(), like you did in the previous exercise.
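
That line-by-line approach might look roughly like the sketch below. It uses the standard csv module and an in-memory stand-in for census.csv with three headerless rows; the column types and the in-memory SQLite engine are assumptions made for the example.

```python
import csv
import io

from sqlalchemy import (create_engine, MetaData, Table, Column, String,
                        Integer, insert)

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()
# Column names follow the census table; the types are assumed
census = Table('census', metadata,
               Column('state', String(30)),
               Column('sex', String(1)),
               Column('age', Integer()),
               Column('pop2000', Integer()),
               Column('pop2008', Integer()))
metadata.create_all(engine)

# Stand-in for census.csv: three rows without a header line
csv_file = io.StringIO(
    'Illinois,M,0,89600,95012\n'
    'Illinois,M,1,88445,91829\n'
    'Illinois,M,2,88729,89547\n'
)

fieldnames = ['state', 'sex', 'age', 'pop2000', 'pop2008']
values_list = []
for line in csv.DictReader(csv_file, fieldnames=fieldnames):
    # csv yields strings, so convert the numeric fields explicitly
    for key in ('age', 'pop2000', 'pop2008'):
        line[key] = int(line[key])
    values_list.append(line)

# Insert all dictionaries in one call, committing at the end of the block
with engine.begin() as conn:
    conn.execute(insert(census), values_list)

print(len(values_list))
```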

But there is a faster way using pandas. You can read a CSV file into a DataFrame using the read_csv() function (this function should be familiar to you, but you can run help(pd.read_csv) in the console to refresh your memory!). Then, you can call the .to_sql() method on the DataFrame to load it into a SQL table in a database. The columns of the DataFrame should match the columns of the SQL table.

.to_sql() has many parameters, but in this exercise we will use the following:

* name is the name of the SQL table (as a string).
* con is the connection to the database that you will use to upload the data.
* if_exists specifies how to behave if the table already exists in the database; possible values are "fail", "replace", and "append".
* index (True or False) specifies whether to write the DataFrame's index as a column.

In this exercise, you will upload the data contained in the census.csv file into an existing table "census". The connection to the database has already been created for you.
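
To get a feel for the three if_exists options, here is a small sketch. It assumes an in-memory SQLite engine, a hypothetical census_demo table, and a two-row DataFrame in place of the real census data.

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///:memory:')

# Small stand-in for the census DataFrame
df = pd.DataFrame({'state': ['Illinois', 'Texas'], 'pop2000': [89600, 27961]})

def count_rows():
    """Return the number of rows currently in census_demo."""
    with engine.connect() as conn:
        return conn.execute(text('SELECT COUNT(*) FROM census_demo')).scalar()

# The table does not exist yet, so 'append' creates it and adds 2 rows
df.to_sql(name='census_demo', con=engine, if_exists='append', index=False)
after_first = count_rows()

# 'append' again adds the same 2 rows on top of the existing ones
df.to_sql(name='census_demo', con=engine, if_exists='append', index=False)
after_append = count_rows()

# 'replace' drops the table and recreates it from the DataFrame
df.to_sql(name='census_demo', con=engine, if_exists='replace', index=False)
after_replace = count_rows()

# 'fail' raises ValueError because the table already exists
try:
    df.to_sql(name='census_demo', con=engine, if_exists='fail', index=False)
    failed = False
except ValueError:
    failed = True

print(after_first, after_append, after_replace, failed)
```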

Instructions
* Use pd.read_csv() to load the "census.csv" file into a DataFrame. Set the header parameter to None since the file doesn't have a header row.
* Rename the columns of census_df to "state", "sex", "age", "pop2000", and "pop2008" to match the columns of the "census" table in the database.

In [38]:
# import pandas
import pandas as pd

# Define the path to the local census.csv file (raw string keeps the Windows backslashes intact)
data_path = r'C:\Users\kanha\Census Data Explorer\data\census.csv'

# read census.csv into a DataFrame : census_df
census_df = pd.read_csv(data_path, header=None)

census_df.head()


          0  1  2      3      4
0  Illinois  M  0  89600  95012
1  Illinois  M  1  88445  91829
2  Illinois  M  2  88729  89547
3  Illinois  M  3  88868  90037
4  Illinois  M  4  91947  91111


In [40]:
# rename the columns of the census DataFrame
census_df.columns = ['state', 'sex', 'age', 'pop2000', 'pop2008']

# Create a connection on engine
connection = engine.connect()

# append the data from census_df to the "census" table via connection
census_df.to_sql(name='census', con=connection, if_exists='append', index=False)

2025-04-10 16:41:54,085 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-04-10 16:41:54,085 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("census")
2025-04-10 16:41:54,085 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-04-10 16:41:54,117 INFO sqlalchemy.engine.Engine INSERT INTO census (state, sex, age, pop2000, pop2008) VALUES (?, ?, ?, ?, ?)
2025-04-10 16:41:54,117 INFO sqlalchemy.engine.Engine [generated in 0.02344s] [('Illinois', 'M', 0, 89600, 95012), ('Illinois', 'M', 1, 88445, 91829), ('Illinois', 'M', 2, 88729, 89547), ('Illinois', 'M', 3, 88868, 90037), ('Illinois', 'M', 4, 91947, 91111), ('Illinois', 'M', 5, 93894, 89802), ('Illinois', 'M', 6, 93676, 88931), ('Illinois', 'M', 7, 94818, 90940)  ... displaying 10 of 8772 total bound parameter sets ...  ('Texas', 'F', 84, 27961, 36821), ('Texas', 'F', 85, 171538, 223439)]
2025-04-10 16:41:54,131 INFO sqlalchemy.engine.Engine COMMIT


8772

## Updating individual records

The update statement is very similar to an insert statement. For example, you can update all wages in the employees table as follows:

stmt = update(employees).values(wage=100.00)

The update statement also typically uses a where clause to determine which data to update. For example, to update only the record for the employee with ID 15, you would append a where clause to the previous statement as follows:

stmt = stmt.where(employees.columns.id == 15)

You'll be using the FIPS state code here, which is assigned by the U.S. government to identify U.S. states and certain other associated areas.

For your convenience, the names of the tables and columns of interest in this exercise are: state_fact (Table), name (Column), and fips_state (Column).
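
The employees example above can be sketched end to end as follows. The employees table, its column types, and the two sample rows are hypothetical, and an in-memory SQLite engine is assumed.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        Float, insert, select, update)

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()
# Hypothetical employees table matching the example in the text
employees = Table('employees', metadata,
                  Column('id', Integer()),
                  Column('wage', Float()))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(employees), [{'id': 14, 'wage': 80.0},
                                     {'id': 15, 'wage': 90.0}])

    # Without where(), every row would be updated
    stmt = update(employees).values(wage=100.00)
    # Columns of a Table are reached through .columns (or the .c shorthand)
    stmt = stmt.where(employees.columns.id == 15)
    updated = conn.execute(stmt).rowcount

    # Read the wages back to confirm that only employee 15 changed
    rows = conn.execute(select(employees)).all()
    wages = {row.id: row.wage for row in rows}

print(updated, wages)
```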

Instructions
Notice that there is only one record in state_fact for the state of New York. It currently has the FIPS code of 0.

* Build an update statement to change the fips_state column code to 36, save it as update_stmt.
* Use a where clause to filter for states with the name of 'New York' in the state_fact table.
* Execute update_stmt via the connection and save the output as update_results.

In [43]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, insert, select, update

# Create an engine (using SQLite in-memory database as an example)
engine = create_engine('sqlite:///:memory:', echo=True)

# Create a MetaData instance
metadata = MetaData()

# Define the state_fact table (structure assumed from the exercise text)
state_fact = Table('state_fact', metadata,
                   Column('name', String(255)),
                   Column('fips_state', String(2)),  # FIPS state codes are two characters
                   # Add other columns as needed
                  )

# Create the table in the database
metadata.create_all(engine)

# Create a connection
connection = engine.connect()

# Populate state_fact with demonstration data; New York starts with the wrong code
state_data = [
    {'name': 'New York', 'fips_state': '0'},
    {'name': 'California', 'fips_state': '06'}
]
connection.execute(insert(state_fact), state_data)

# Build a select statement: select_stmt
select_stmt = select(state_fact).where(state_fact.columns.name == 'New York')

# Execute select_stmt and fetch the results
results = connection.execute(select_stmt).fetchall()

# Print the results of executing the select_stmt
print(results)

# Print the FIPS code for the first row of the result
print(results[0].fips_state)

# Build a statement to update the fips_state to 36: update_stmt
update_stmt = update(state_fact).values(fips_state='36')

# Append a where clause to limit it to records for New York state
update_stmt = update_stmt.where(state_fact.columns.name == 'New York')

# Execute the statement: update_results
update_results = connection.execute(update_stmt)

# Print rowcount
print(update_results.rowcount)

# Close the connection
connection.close()