~~~
filename = 'huck_finn.txt'

file = open(filename, mode='r') # 'r' is to read

text = file.read()

file.close()

print(text)
~~~

#### Context Manager

~~~
with open('huck_finn.txt','r') as file:
	print(file.read())
~~~
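
A minimal sketch of why the context manager is convenient: the file is closed automatically when the `with` block exits, so no explicit `close()` is needed. The file name `example.txt` and its contents are made up for illustration, and the file is written to a temporary directory so the snippet runs on its own.

```python
import os
import tempfile

# Hypothetical sample file, created so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w') as file:
    file.write('line one\nline two\n')

# Read it back; the file is closed as soon as the block ends
with open(path, 'r') as file:
    first_line = file.readline()  # reads only the first line
    rest = file.read()            # reads everything that remains

print(file.closed)  # True
print(first_line)
```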

**Flat files**: plain text files containing records, that is, table data with one record (row of fields) per line.

~~~
import pandas as pd

dic = {'Name': ['Ana', 'Bob', 'Carol'], 'Age': [13, 28, 55]}
data = pd.DataFrame(dic)

print(data.to_numpy())
~~~

Output:

~~~
[['Ana' 13]
 ['Bob' 28]
 ['Carol' 55]]
~~~
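
As a sketch of what "records" means in a flat file, the snippet below parses a small comma-separated text with the standard-library `csv` module. The sample data mirror the table above and are held in a string rather than a real file (an assumption made to keep the example self-contained); with pandas, `pd.read_csv` would typically do the same job in one call.

```python
import csv
import io

# Made-up flat file content: a header row followed by one record per line
raw = "Name,Age\nAna,13\nBob,28\nCarol,55\n"

# DictReader uses the header row as the keys for every record
reader = csv.DictReader(io.StringIO(raw))
records = [row for row in reader]

for row in records:
    # csv returns every field as a string, so convert numbers explicitly
    print(row['Name'], int(row['Age']))
```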


**Pickled files**:

~~~
import pickle

with open('pickled_fruit.pkl','rb') as file:
	data = pickle.load(file)

print(data)
~~~
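
The loading code above assumes the pickle file already exists. A small round-trip sketch, with a made-up dictionary and a temporary file, shows how such a file could be produced and read back. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Hypothetical object to serialize
fruit = {'apples': 3, 'pears': 5}

path = os.path.join(tempfile.mkdtemp(), 'pickled_fruit.pkl')

# Pickles are bytes, so the file must be opened in binary mode ('wb' / 'rb')
with open(path, 'wb') as file:
    pickle.dump(fruit, file)

with open(path, 'rb') as file:
    restored = pickle.load(file)

print(restored == fruit)  # True
```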

**Excel files**:

~~~
import pandas as pd

file = 'urbanpop.xlsx'

data = pd.ExcelFile(file)

print(data.sheet_names)

df1 = data.parse('1960-1966')
df2 = data.parse(0)
~~~

Listing the contents of the working directory:

~~~
import os
wd = os.getcwd()
os.listdir(wd)
~~~

#### SAS and Stata files

~~~
import pandas as pd

# SAS files

from sas7bdat import SAS7BDAT

with SAS7BDAT('urbanpop.sas7bdat') as file:
	df_sas = file.to_data_frame()

# Stata files

data = pd.read_stata('urbanpop.dta')
~~~

#### HDF5 files

HDF5 (Hierarchical Data Format version 5) is a standard for storing large quantities of numerical data. Datasets can scale to exabytes.

~~~
import h5py

filename = 'H-H1_LOSC_4_V1-815411200-4096.hdf5'

data = h5py.File(filename,'r')

print(type(data))

for key in data.keys():
	print(key) # meta, quality, strain

~~~

#### MATLAB files

~~~
scipy.io.loadmat() # read .mat files

scipy.io.savemat() # write .mat files
~~~

~~~
import scipy.io

filename = 'workspace.mat'

mat = scipy.io.loadmat(filename)

print(type(mat)) # dict!

# keys: MATLAB var names
# values: objects assigned to vars
~~~

### Intro to relational DBs

Using SQLAlchemy:

~~~
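from sqlalchemy import create_engine, inspect

engine = create_engine('sqlite:///Northwind.sqlite')

# engine.table_names() was removed in SQLAlchemy 2.0; the inspector works in 1.4 and 2.x
table_names = inspect(engine).get_table_names()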
~~~

**Workflow of SQL querying**:

- Import packages and functions
- Create the database engine
- Connect to the engine
- Query the database
- Save query results to a DataFrame
- Close the connection
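
The steps above can be sketched end to end with the standard-library `sqlite3` module standing in for SQLAlchemy, so the example runs without extra packages or the Northwind file. The in-memory database, the `Orders` table, and its rows are invented for illustration, and the results are kept as plain lists rather than a DataFrame.

```python
import sqlite3

# Connect to an in-memory database (stand-in for "create the engine" + "connect")
con = sqlite3.connect(':memory:')

# Made-up table so there is something to query
con.execute('CREATE TABLE Orders (OrderID INTEGER, ShipName TEXT)')
con.executemany('INSERT INTO Orders VALUES (?, ?)',
                [(1, 'Alpha'), (2, 'Beta'), (3, 'Gamma')])

# Query the database
cur = con.execute('SELECT OrderID, ShipName FROM Orders ORDER BY OrderID')

# Save the results: column names come from the cursor description
columns = [col[0] for col in cur.description]
rows = cur.fetchall()

# Close the connection
con.close()

print(columns)
print(rows)
```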

Connecting...

~~~
from sqlalchemy import create_engine, text

import pandas as pd

engine = create_engine('sqlite:///Northwind.sqlite')

con = engine.connect()

# Wrapping the query in text() is required in SQLAlchemy 2.x and also works in 1.4
rs = con.execute(text('SELECT * FROM Orders'))

df = pd.DataFrame(rs.fetchall())

df.columns = rs.keys()

con.close()
~~~

Using context manager:

~~~
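from sqlalchemy import create_engine, text

import pandas as pd

engine = create_engine('sqlite:///Northwind.sqlite')

with engine.connect() as con:
    # Wrapping the query in text() is required in SQLAlchemy 2.x and also works in 1.4
    rs = con.execute(text('SELECT OrderID, OrderDate, ShipName FROM Orders'))

    # Fetch only the first 5 rows of the result
    df = pd.DataFrame(rs.fetchmany(size=5))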

    df.columns = rs.keys()

~~~
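
To illustrate how `fetchmany` consumes a result in batches, here is a small sketch using the standard-library `sqlite3` cursor, which offers a `fetchmany(size=...)` method with similar behavior. The table and its seven rows are made up for the example.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE Orders (OrderID INTEGER)')
con.executemany('INSERT INTO Orders VALUES (?)', [(i,) for i in range(1, 8)])

cur = con.execute('SELECT OrderID FROM Orders ORDER BY OrderID')

first_batch = cur.fetchmany(size=5)   # rows 1 to 5
second_batch = cur.fetchmany(size=5)  # only 2 rows remain, so 2 are returned

con.close()

print(len(first_batch), len(second_batch))  # 5 2
```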

**The Pandas Way**:

~~~
from sqlalchemy import create_engine

import pandas as pd

engine = create_engine('sqlite:///Northwind.sqlite')

df = pd.read_sql_query('SELECT * FROM Orders',engine)
~~~

**INNER JOIN in Python**:

~~~
df = pd.read_sql_query('SELECT OrderID, CompanyName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID', engine)
~~~
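
A self-contained sketch of what the inner join returns, using the standard-library `sqlite3` module and two tiny made-up tables in place of the Northwind database. Only orders whose `CustomerID` has a match in `Customers` appear in the result.

```python
import sqlite3

con = sqlite3.connect(':memory:')

# Hypothetical miniature versions of the two tables
con.executescript("""
    CREATE TABLE Customers (CustomerID TEXT, CompanyName TEXT);
    CREATE TABLE Orders (OrderID INTEGER, CustomerID TEXT);
    INSERT INTO Customers VALUES ('A', 'Acme'), ('B', 'Bolt');
    INSERT INTO Orders VALUES (1, 'A'), (2, 'A'), (3, 'Z');
""")

query = """
    SELECT OrderID, CompanyName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY OrderID
"""
joined = con.execute(query).fetchall()
con.close()

# Order 3 is dropped because customer 'Z' has no matching row in Customers
print(joined)
```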