## Summary notes

Write records stored in a local CSV file to a SQL database.

Sources:

- [pandas.DataFrame.to_sql](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html) (*pandas*), for the full signature
- [How can I create an in-memory database with sqlite?](https://stackoverflow.com/questions/38676691/how-can-i-create-an-in-memory-database-with-sqlite) (*StackOverflow*)

### History

- 2022-10-08
   - Notebook initialised
- 2022-10-09
   - Swapped *sqlite3* $\to$ *sqlalchemy*
   - Removed the function; added it as an example script instead

## Dependencies

In [1]:
import pandas as pd
import sqlalchemy as sql

## Main

Create a new, trivial local CSV file at the given path, and create an in-memory SQLite database.

In [2]:
(pd.DataFrame()
 .assign(
     pk=[1, 2, 3],
     user=['user1', 'user2', 'user3'],
     join_date=['2001-01-01', '2002-02-02', '2003-03-03']
 )
 .to_csv('__cache\\club_users.csv')
)
engine = sql.create_engine('sqlite://', echo=False)

Write the CSV file to a table named *user*.

In [3]:
(pd.read_csv('__cache\\club_users.csv')
 .to_sql('user', engine, if_exists='replace', index=False, chunksize=10000)
)

3
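The `chunksize` argument above controls how many rows `to_sql` writes per batch, but `read_csv` still loads the whole file first. For a larger file, the read can be chunked as well. The following is a minimal sketch under stated assumptions, not part of the original notebook: the CSV content is held in an in-memory string (`csv_text`, a hypothetical stand-in for the file on disk) and the chunk size of 2 is chosen only to force more than one chunk.

```python
import io

import pandas as pd
import sqlalchemy as sql

# Hypothetical stand-in for the CSV file on disk.
csv_text = (
    "pk,user,join_date\n"
    "1,user1,2001-01-01\n"
    "2,user2,2002-02-02\n"
    "3,user3,2003-03-03\n"
)

engine = sql.create_engine("sqlite://", echo=False)

# Read the CSV two rows at a time; the first chunk replaces any
# existing table, later chunks are appended to it.
for i, chunk in enumerate(pd.read_csv(io.StringIO(csv_text), chunksize=2)):
    chunk.to_sql(
        "user",
        engine,
        if_exists="replace" if i == 0 else "append",
        index=False,
    )

# Count the rows that ended up in the table.
with engine.connect() as conn:
    row_count = conn.execute(sql.text("SELECT COUNT(*) FROM user")).scalar()
```

Replacing on the first chunk and appending afterwards keeps the behaviour of `if_exists='replace'` while bounding how much of the file is held in memory at once.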

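Confirm the new table's schema and content. The extra `Unnamed: 0` column is the DataFrame index, which `to_csv` wrote to the file because `index=False` was not passed.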

In [4]:
pd.read_sql('SELECT * FROM user', engine).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Unnamed: 0  3 non-null      int64 
 1   pk          3 non-null      int64 
 2   user        3 non-null      object
 3   join_date   3 non-null      object
dtypes: int64(2), object(2)
memory usage: 224.0+ bytes
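`DataFrame.info()` reports the dtypes pandas infers when reading the rows back, not the column definitions stored in the database. As a hedged alternative, SQLAlchemy's inspector can list the table's columns directly. This sketch builds its own small table (the `users` frame is illustrative) so that it runs on its own.

```python
import pandas as pd
import sqlalchemy as sql

engine = sql.create_engine("sqlite://", echo=False)

# Illustrative data, mirroring the columns used in this notebook.
users = pd.DataFrame(
    {
        "pk": [1, 2, 3],
        "user": ["user1", "user2", "user3"],
        "join_date": ["2001-01-01", "2002-02-02", "2003-03-03"],
    }
)
users.to_sql("user", engine, if_exists="replace", index=False)

# Ask the database itself which columns the table has.
inspector = sql.inspect(engine)
column_names = [col["name"] for col in inspector.get_columns("user")]
table_names = inspector.get_table_names()
```

Each entry returned by `get_columns` is a dictionary that also carries the column's `type`, which can be printed to see how the database declared it.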


In [5]:
pd.read_sql('SELECT * FROM user', engine)

|   | Unnamed: 0 | pk | user | join_date |
|---|------------|----|-------|------------|
| 0 | 0 | 1 | user1 | 2001-01-01 |
| 1 | 1 | 2 | user2 | 2002-02-02 |
| 2 | 2 | 3 | user3 | 2003-03-03 |
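One way to keep an index column such as `Unnamed: 0` out of the table is to pass `index=False` to `to_csv` when the file is created. The sketch below is a variant of the cells above, not the notebook's original code: it assumes a temporary directory instead of `__cache` so that it runs anywhere, and it reads the table back through an explicit connection.

```python
import tempfile
from pathlib import Path

import pandas as pd
import sqlalchemy as sql

users = pd.DataFrame(
    {
        "pk": [1, 2, 3],
        "user": ["user1", "user2", "user3"],
        "join_date": ["2001-01-01", "2002-02-02", "2003-03-03"],
    }
)

engine = sql.create_engine("sqlite://", echo=False)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "club_users.csv"
    # index=False stops pandas from writing the row index as a column.
    users.to_csv(csv_path, index=False)
    pd.read_csv(csv_path).to_sql("user", engine, if_exists="replace", index=False)

# Read the table back to check which columns were stored.
with engine.connect() as conn:
    result = pd.read_sql(sql.text("SELECT * FROM user"), conn)

columns = list(result.columns)
```

With the index left out of the file, the table holds only `pk`, `user` and `join_date`.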


In [6]:
%load_ext watermark
%watermark --iv

sqlalchemy: 1.4.41
pandas    : 1.5.0

