# [2] CSV to SQL Part 1 - Lending Club Data

---
I sourced my loan data from Lending Club, the largest peer-to-peer lending organisation in the world. I downloaded 12 CSV files covering loans from 2007 to 2017, which together come to well over 1GB. My first step is therefore to push the data into an SQL database. This should make loading the data into pandas much faster and also lets me use SQL queries.

Once the data is stored in SQL, I will perform exploratory data analysis to clean and analyse it. After that, I will build a classification model to predict whether a loan will default or be fully paid, with predicted values of 1 and 0, respectively.

---

In [2]:
import pandas as pd

In [3]:
# Create SQL alchemy engine to create a connection with my local PostgreSQL database.
from sqlalchemy import create_engine
engine = create_engine('postgresql://postgres:database@localhost:5432/Capstone')
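
The connection string above has the password written directly in the notebook. A minimal sketch of one alternative is to assemble the URL from environment variables, using only the standard library. The variable names (`PGUSER`, `PGPASSWORD`, and so on) and the helper `build_postgres_url` are assumptions made for illustration, not part of the original setup; the defaults mirror the values in the cell above.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=None):
    """Assemble a PostgreSQL URL from environment-style settings (hypothetical helper)."""
    env = os.environ if env is None else env
    user = env.get("PGUSER", "postgres")
    # Percent-encode the password so characters such as '@' or spaces
    # cannot be mistaken for URL delimiters.
    password = quote(env.get("PGPASSWORD", ""), safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "Capstone")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Example with an explicit mapping instead of the real environment.
print(build_postgres_url({"PGPASSWORD": "p@ss word"}))
```

The resulting string could then be passed to `create_engine(...)` in place of the literal URL.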

---
I have downloaded 12 CSV files from Lending Club, covering loans from 2007 to 2017. All files share the same columns. I will load each file into pandas and then push it into a local PostgreSQL database, with each file becoming its own table. The advantage of using SQL is that reading the data back should be much faster than reading from CSV, and it also allows the use of SQL queries. Once all 12 files are in SQL, I can concatenate them to form one large table with all records.
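
Because the 12 files follow one naming pattern, the load-and-push steps could be driven from a single list rather than repeated by hand. The sketch below only builds the (file path, table name) pairs. The period labels come from the file and table names used in this notebook, while `PERIODS`, `BASE_DIR`, `csv_path` and `table_name` are hypothetical names introduced for illustration.

```python
# Period labels taken from the file and table names used in this notebook.
PERIODS = [
    "07-11", "12-13", "14", "15",
    "16-q1", "16-q2", "16-q3", "16-q4",
    "17-q1", "17-q2", "17-q3", "17-q4",
]

BASE_DIR = "./Datasets/Lending Club"  # same folder as the cells in this notebook


def csv_path(period):
    """Path of the CSV file for one period (hypothetical helper)."""
    return f"{BASE_DIR}/lendclub_{period}.csv"


def table_name(period):
    """Name of the SQL table for one period (hypothetical helper)."""
    return f"LC_{period}"


# One (path, table) pair per file.
jobs = [(csv_path(p), table_name(p)) for p in PERIODS]
for path, table in jobs:
    print(f"{path} -> {table}")
```

Each pair could then be passed to the same `pd.read_csv(...)` and `to_sql(...)` calls used in the cells of this notebook, inside one loop.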

---

## 2007-2011

---
The first file covers loans from the years 2007 to 2011. I will load it into pandas before pushing it into SQL.
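
The file is not a clean table from top to bottom, so the read in the next cells passes `skiprows=1` and `skipfooter=2` to drop one leading line and two trailing lines. To make the effect concrete, here is a minimal sketch using only the standard library. The sample text is invented for illustration (it is not taken from the Lending Club files), and slicing the list of lines stands in for the two pandas arguments.

```python
import csv

# Invented sample: one non-tabular line at the top, two at the bottom.
raw = """This first line is a notice, not a header
loan_amnt,term,grade
5000,36 months,B
2500,60 months,C
Summary line one
Summary line two
"""

lines = raw.splitlines()
table_lines = lines[1:-2]  # drop the first line and the last two lines

# The first remaining line is now the real header row.
rows = list(csv.DictReader(table_lines))

print(len(rows))
print(rows[0])
```

Without the slice, the notice line would be treated as the header and the summary lines as data, which is why the file does not load normally as it stands.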

---

In [3]:
file1 = './Datasets/Lending Club/lendclub_07-11.csv'   # Path of file.

In [4]:
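df1 = pd.read_csv(file1, skiprows=1, skipfooter=2, engine='python')
# Skip the first row and the last two rows: they are not part of the table and prevent pandas from parsing the file normally.
# skipfooter is only supported by the Python parser, so engine='python' is set explicitly to avoid a ParserWarning.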


In [5]:
df1.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,5000.0,5000.0,4975.0,36 months,10.65%,162.87,B,B2,...,,,Cash,N,,,,,,
1,,,2500.0,2500.0,2500.0,60 months,15.27%,59.83,C,C4,...,,,Cash,N,,,,,,
2,,,2400.0,2400.0,2400.0,36 months,15.96%,84.33,C,C5,...,,,Cash,N,,,,,,
3,,,10000.0,10000.0,10000.0,36 months,13.49%,339.31,C,C1,...,,,Cash,N,,,,,,
4,,,3000.0,3000.0,3000.0,60 months,12.69%,67.79,B,B5,...,,,Cash,N,,,,,,


In [6]:
df1.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
42531,,,3500.0,3500.0,225.0,36 months,10.28%,113.39,C,C1,...,,,Cash,N,,,,,,
42532,,,1000.0,1000.0,0.0,36 months,9.64%,32.11,B,B4,...,,,Cash,N,,,,,,
42533,,,2525.0,2525.0,225.0,36 months,9.33%,80.69,B,B3,...,,,Cash,N,,,,,,
42534,,,6500.0,6500.0,0.0,36 months,8.38%,204.84,A,A5,...,,,Cash,N,,,,,,
42535,,,5000.0,5000.0,0.0,36 months,7.75%,156.11,A,A3,...,,,Cash,N,,,,,,


In [7]:
df1.shape

(42536, 145)

In [8]:
df1.to_sql(name='LC_07-11', con=engine, if_exists='replace', index = False)

---
The cell above pushes the dataframe into an SQL table; the query below reads it back to confirm that the write worked.
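
A lighter way to confirm a write is to compare row counts rather than reading the whole table back. The sketch below uses an in-memory SQLite database from the standard library as a stand-in for PostgreSQL, with three sample rows; the `count_rows` helper is hypothetical. Note the double quotes around the table name, which are needed because of the hyphen.

```python
import sqlite3

# Three sample rows, used only for this illustration.
rows = [
    (5000.0, "36 months", "B"),
    (2500.0, "60 months", "C"),
    (2400.0, "36 months", "C"),
]

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute('CREATE TABLE "LC_07-11" (loan_amnt REAL, term TEXT, grade TEXT)')
conn.executemany('INSERT INTO "LC_07-11" VALUES (?, ?, ?)', rows)
conn.commit()


def count_rows(connection, table):
    """Return the number of rows in a table (hypothetical helper).

    The table name is quoted as an identifier; it should come from trusted
    code, not from user input.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    return connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]


written = len(rows)
stored = count_rows(conn, "LC_07-11")
print(written, stored, written == stored)
```

With pandas, the same idea would be to run a `select count(*)` query through `pd.read_sql` and compare the result with `df1.shape[0]`.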

---

In [9]:
SQL_STRING = '''

select * from "LC_07-11"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,5000.0,5000.0,4975.0,36 months,10.65%,162.87,B,B2,...,,,Cash,N,,,,,,
1,,,2500.0,2500.0,2500.0,60 months,15.27%,59.83,C,C4,...,,,Cash,N,,,,,,
2,,,2400.0,2400.0,2400.0,36 months,15.96%,84.33,C,C5,...,,,Cash,N,,,,,,
3,,,10000.0,10000.0,10000.0,36 months,13.49%,339.31,C,C1,...,,,Cash,N,,,,,,
4,,,3000.0,3000.0,3000.0,60 months,12.69%,67.79,B,B5,...,,,Cash,N,,,,,,


In [10]:
df.shape

(42536, 145)

## 2012-2013

---
Now I will repeat this process for the other 11 files.

---

In [11]:
file2 = './Datasets/Lending Club/lendclub_12-13.csv'

In [12]:
df2 = pd.read_csv(file2, skiprows=1, skipfooter=2, engine='python')


In [13]:
df2.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,14000,14000,14000.0,36 months,12.85%,470.71,B,B4,...,,,Cash,N,,,,,,
1,,,15000,15000,15000.0,36 months,14.47%,516.1,C,C2,...,,,Cash,N,,,,,,
2,,,15000,15000,15000.0,36 months,8.90%,476.3,A,A5,...,,,Cash,N,,,,,,
3,,,10000,10000,10000.0,36 months,9.67%,321.13,B,B1,...,,,Cash,N,,,,,,
4,,,20800,20800,20800.0,36 months,13.53%,706.16,B,B5,...,,,Cash,N,,,,,,


In [14]:
df2.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
188176,,,20500,20500,20500.0,36 months,16.77%,728.54,D,D2,...,,,Cash,N,,,,,,
188177,,,15000,15000,15000.0,36 months,15.27%,521.97,C,C4,...,,,Cash,N,,,,,,
188178,,,35000,35000,35000.0,36 months,15.96%,1229.81,C,C5,...,,,Cash,N,,,,,,
188179,,,12000,12000,12000.0,36 months,16.29%,423.61,D,D1,...,,,Cash,N,,,,,,
188180,,,12000,7775,7775.0,60 months,15.27%,186.08,C,C4,...,,,Cash,N,,,,,,


In [15]:
df2.shape

(188181, 145)

In [16]:
df2.to_sql(name='LC_12-13', con=engine, if_exists='replace', index = False)

In [17]:
SQL_STRING = '''

select * from "LC_12-13"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,14000,14000,14000.0,36 months,12.85%,470.71,B,B4,...,,,Cash,N,,,,,,
1,,,15000,15000,15000.0,36 months,14.47%,516.1,C,C2,...,,,Cash,N,,,,,,
2,,,15000,15000,15000.0,36 months,8.90%,476.3,A,A5,...,,,Cash,N,,,,,,
3,,,10000,10000,10000.0,36 months,9.67%,321.13,B,B1,...,,,Cash,N,,,,,,
4,,,20800,20800,20800.0,36 months,13.53%,706.16,B,B5,...,,,Cash,N,,,,,,


In [18]:
df.shape

(188181, 145)

## 2014

In [19]:
file3 = './Datasets/Lending Club/lendclub_14.csv'

In [20]:
df3 = pd.read_csv(file3, skiprows=1, skipfooter=2, engine='python')


In [21]:
df3.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,60 months,12.39%,336.64,C,C1,...,,,Cash,N,,,,,,
1,,,10400,10400,10400,36 months,6.99%,321.08,A,A3,...,,,Cash,N,,,,,,
2,,,12800,12800,12800,60 months,17.14%,319.08,D,D4,...,,,Cash,N,,,,,,
3,,,7650,7650,7650,36 months,13.66%,260.2,C,C3,...,,,Cash,N,,,,,,
4,,,21425,21425,21425,60 months,15.59%,516.36,D,D1,...,,,Cash,N,,,,,,


In [22]:
df3.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
235624,,,18400,18400,18400,60 months,14.47%,432.64,C,C2,...,,,Cash,N,,,,,,
235625,,,22000,22000,22000,60 months,19.97%,582.5,D,D5,...,,,Cash,N,,,,,,
235626,,,20700,20700,20700,60 months,16.99%,514.34,D,D1,...,,,Cash,N,,,,,,
235627,,,2000,2000,2000,36 months,7.90%,62.59,A,A4,...,,,Cash,N,,,,,,
235628,,,10000,10000,9975,36 months,19.20%,367.58,D,D3,...,,,Cash,N,,,,,,


In [23]:
df3.shape

(235629, 145)

In [24]:
df3.to_sql(name='LC_14', con=engine, if_exists='replace', index = False)

In [25]:
SQL_STRING = '''

select * from "LC_14"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,60 months,12.39%,336.64,C,C1,...,,,Cash,N,,,,,,
1,,,10400,10400,10400,36 months,6.99%,321.08,A,A3,...,,,Cash,N,,,,,,
2,,,12800,12800,12800,60 months,17.14%,319.08,D,D4,...,,,Cash,N,,,,,,
3,,,7650,7650,7650,36 months,13.66%,260.2,C,C3,...,,,Cash,N,,,,,,
4,,,21425,21425,21425,60 months,15.59%,516.36,D,D1,...,,,Cash,N,,,,,,


In [26]:
df.shape

(235629, 145)

## 2015

In [27]:
file4 = './Datasets/Lending Club/lendclub_15.csv'

In [28]:
df4 = pd.read_csv(file4, skiprows=1, skipfooter=2, engine='python')


In [29]:
df4.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,9000,9000,9000.0,36 months,8.49%,284.07,B,B1,...,,,Cash,N,,,,,,
1,,,17475,17475,17475.0,36 months,13.99%,597.17,C,C4,...,,,Cash,Y,Jan-2018,ACTIVE,Jan-2018,4382.0,44.99,12.0
2,,,21000,21000,21000.0,60 months,13.99%,488.53,C,C4,...,,,Cash,N,,,,,,
3,,,7200,7200,7200.0,36 months,11.48%,237.36,B,B5,...,,,Cash,N,,,,,,
4,,,27500,27500,27500.0,60 months,14.85%,652.06,C,C5,...,,,Cash,N,,,,,,


In [30]:
df4.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
421090,,,10000,10000,10000.0,36 months,11.99%,332.1,B,B5,...,,,Cash,N,,,,,,
421091,,,24000,24000,24000.0,36 months,11.99%,797.03,B,B5,...,,,Cash,N,,,,,,
421092,,,12000,12000,12000.0,60 months,19.99%,317.86,E,E3,...,,,Cash,N,,,,,,
421093,,,13000,13000,13000.0,60 months,15.99%,316.07,D,D2,...,,,Cash,N,,,,,,
421094,,,20000,20000,20000.0,36 months,11.99%,664.2,B,B5,...,,,Cash,N,,,,,,


In [31]:
df4.shape

(421095, 145)

In [32]:
df4.to_sql(name='LC_15', con=engine, if_exists='replace', index = False)

In [33]:
SQL_STRING = '''

select * from "LC_15"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,9000,9000,9000.0,36 months,8.49%,284.07,B,B1,...,,,Cash,N,,,,,,
1,,,17475,17475,17475.0,36 months,13.99%,597.17,C,C4,...,,,Cash,Y,Jan-2018,ACTIVE,Jan-2018,4382.0,44.99,12.0
2,,,21000,21000,21000.0,60 months,13.99%,488.53,C,C4,...,,,Cash,N,,,,,,
3,,,7200,7200,7200.0,36 months,11.48%,237.36,B,B5,...,,,Cash,N,,,,,,
4,,,27500,27500,27500.0,60 months,14.85%,652.06,C,C5,...,,,Cash,N,,,,,,


In [34]:
df.shape

(421095, 145)

## 2016 Q1

In [35]:
file5 = './Datasets/Lending Club/lendclub_16-q1.csv'

In [36]:
df5 = pd.read_csv(file5, skiprows=1, skipfooter=2, engine='python')


In [37]:
df5.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,10000,10000,10000.0,60 months,19.53%,262.34,D,D5,...,,,Cash,N,,,,,,
1,,,35000,35000,35000.0,60 months,20.75%,941.96,E,E2,...,,,Cash,N,,,,,,
2,,,20000,20000,20000.0,60 months,9.16%,416.73,B,B2,...,,,Cash,N,,,,,,
3,,,17475,17475,17475.0,60 months,11.47%,384.06,B,B5,...,,,Cash,N,,,,,,
4,,,8000,8000,8000.0,36 months,9.16%,255.0,B,B2,...,,,Cash,N,,,,,,


In [38]:
df5.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
133882,,,6000,6000,6000.0,36 months,7.89%,187.72,A,A5,...,,,Cash,N,,,,,,
133883,,,6000,6000,6000.0,36 months,9.17%,191.28,B,B2,...,,,Cash,N,,,,,,
133884,,,14400,14400,14400.0,60 months,13.18%,328.98,C,C3,...,,,Cash,N,,,,,,
133885,,,34050,34050,34050.0,36 months,15.41%,1187.21,D,D1,...,,,Cash,N,,,,,,
133886,,,5000,5000,5000.0,36 months,11.22%,164.22,B,B5,...,,,Cash,N,,,,,,


In [39]:
df5.shape

(133887, 145)

In [40]:
df5.to_sql(name='LC_16-q1', con=engine, if_exists='replace', index = False)

In [41]:
SQL_STRING = '''

select * from "LC_16-q1"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,10000,10000,10000.0,60 months,19.53%,262.34,D,D5,...,,,Cash,N,,,,,,
1,,,35000,35000,35000.0,60 months,20.75%,941.96,E,E2,...,,,Cash,N,,,,,,
2,,,20000,20000,20000.0,60 months,9.16%,416.73,B,B2,...,,,Cash,N,,,,,,
3,,,17475,17475,17475.0,60 months,11.47%,384.06,B,B5,...,,,Cash,N,,,,,,
4,,,8000,8000,8000.0,36 months,9.16%,255.0,B,B2,...,,,Cash,N,,,,,,


In [42]:
df.shape

(133887, 145)

## 2016 Q2

In [43]:
file6 = './Datasets/Lending Club/lendclub_16-q2.csv'

In [44]:
df6 = pd.read_csv(file6, skiprows=1, skipfooter=2, engine='python')


In [45]:
df6.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,24000,24000,24000.0,60 months,13.99%,558.32,C,C3,...,,,Cash,N,,,,,,
1,,,16000,16000,16000.0,60 months,15.59%,385.62,C,C5,...,,,Cash,N,,,,,,
2,,,12000,12000,12000.0,60 months,21.49%,327.96,D,D5,...,,,Cash,N,,,,,,
3,,,7500,7500,7500.0,36 months,10.99%,245.51,B,B4,...,,,Cash,N,,,,,,
4,,,21000,21000,21000.0,36 months,13.49%,712.54,C,C2,...,,,Cash,N,,,,,,


In [46]:
df6.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
97849,,,7000,7000,7000.0,36 months,5.32%,210.81,A,A1,...,,,Cash,N,,,,,,
97850,,,21475,21475,21475.0,60 months,17.27%,536.84,D,D2,...,,,Cash,N,,,,,,
97851,,,6050,6050,6050.0,60 months,18.25%,154.46,D,D3,...,,,Cash,N,,,,,,
97852,,,30000,30000,30000.0,60 months,18.99%,778.06,D,D4,...,,,Cash,N,,,,,,
97853,,,5000,5000,5000.0,36 months,10.75%,163.11,B,B4,...,,,Cash,N,,,,,,


In [47]:
df6.shape

(97854, 145)

In [48]:
df6.to_sql(name='LC_16-q2', con=engine, if_exists='replace', index = False)

In [49]:
SQL_STRING = '''

select * from "LC_16-q2"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,24000,24000,24000.0,60 months,13.99%,558.32,C,C3,...,,,Cash,N,,,,,,
1,,,16000,16000,16000.0,60 months,15.59%,385.62,C,C5,...,,,Cash,N,,,,,,
2,,,12000,12000,12000.0,60 months,21.49%,327.96,D,D5,...,,,Cash,N,,,,,,
3,,,7500,7500,7500.0,36 months,10.99%,245.51,B,B4,...,,,Cash,N,,,,,,
4,,,21000,21000,21000.0,36 months,13.49%,712.54,C,C2,...,,,Cash,N,,,,,,


In [50]:
df.shape

(97854, 145)

## 2016 Q3

In [51]:
file7 = './Datasets/Lending Club/lendclub_16-q3.csv'

In [52]:
df7 = pd.read_csv(file7, skiprows=1, skipfooter=2, engine='python')


In [53]:
df7.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,36 months,13.99%,512.6,C,C3,...,,,Cash,N,,,,,,
1,,,14000,14000,14000,36 months,14.49%,481.83,C,C4,...,,,Cash,N,,,,,,
2,,,7000,7000,7000,36 months,7.59%,218.04,A,A3,...,,,Cash,N,,,,,,
3,,,4200,4200,4200,36 months,8.59%,132.76,A,A5,...,,,Cash,N,,,,,,
4,,,18000,18000,18000,36 months,8.59%,568.97,A,A5,...,,,Cash,N,,,,,,


In [54]:
df7.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
99115,,,2100,2100,2100,36 months,10.75%,68.51,B,B4,...,,,Cash,N,,,,,,
99116,,,5000,5000,5000,36 months,7.39%,155.28,A,A4,...,,,Cash,N,,,,,,
99117,,,32000,32000,32000,60 months,10.49%,687.65,B,B3,...,,,Cash,N,,,,,,
99118,,,1600,1600,1600,36 months,14.49%,55.07,C,C4,...,,,Cash,N,,,,,,
99119,,,34000,34000,34000,60 months,7.89%,687.61,A,A5,...,,,Cash,N,,,,,,


In [55]:
df7.shape

(99120, 145)

In [56]:
df7.to_sql(name='LC_16-q3', con=engine, if_exists='replace', index = False)

In [57]:
SQL_STRING = '''

select * from "LC_16-q3"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,36 months,13.99%,512.6,C,C3,...,,,Cash,N,,,,,,
1,,,14000,14000,14000,36 months,14.49%,481.83,C,C4,...,,,Cash,N,,,,,,
2,,,7000,7000,7000,36 months,7.59%,218.04,A,A3,...,,,Cash,N,,,,,,
3,,,4200,4200,4200,36 months,8.59%,132.76,A,A5,...,,,Cash,N,,,,,,
4,,,18000,18000,18000,36 months,8.59%,568.97,A,A5,...,,,Cash,N,,,,,,


In [58]:
df.shape

(99120, 145)

## 2016 Q4

In [59]:
file8 = './Datasets/Lending Club/lendclub_16-q4.csv'

In [60]:
df8 = pd.read_csv(file8, skiprows=1, skipfooter=2, engine='python')


In [61]:
df8.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,19200,19200,19200.0,36 months,13.99%,656.12,C,C3,...,,,Cash,N,,,,,,
1,,,11100,11100,11100.0,36 months,5.32%,334.28,A,A1,...,,,Cash,N,,,,,,
2,,,8000,8000,8000.0,36 months,7.99%,250.66,A,A5,...,,,Cash,N,,,,,,
3,,,39000,39000,39000.0,36 months,15.99%,1370.94,C,C5,...,,,Cash,N,,,,,,
4,,,13400,13400,13400.0,36 months,13.49%,454.67,C,C2,...,,,Cash,N,,,,,,


In [62]:
df8.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
103541,,,24000,24000,24000.0,60 months,12.79%,543.5,C,C1,...,,,Cash,N,,,,,,
103542,,,24000,24000,24000.0,60 months,10.49%,515.74,B,B3,...,,,Cash,N,,,,,,
103543,,,40000,40000,40000.0,60 months,10.49%,859.56,B,B3,...,,,Cash,N,,,,,,
103544,,,24000,24000,24000.0,60 months,14.49%,564.56,C,C4,...,,,Cash,N,,,,,,
103545,,,14000,14000,14000.0,60 months,14.49%,329.33,C,C4,...,,,Cash,N,,,,,,


In [63]:
df8.shape

(103546, 145)

In [64]:
df8.to_sql(name='LC_16-q4', con=engine, if_exists='replace', index = False)

In [65]:
SQL_STRING = '''

select * from "LC_16-q4"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,19200,19200,19200.0,36 months,13.99%,656.12,C,C3,...,,,Cash,N,,,,,,
1,,,11100,11100,11100.0,36 months,5.32%,334.28,A,A1,...,,,Cash,N,,,,,,
2,,,8000,8000,8000.0,36 months,7.99%,250.66,A,A5,...,,,Cash,N,,,,,,
3,,,39000,39000,39000.0,36 months,15.99%,1370.94,C,C5,...,,,Cash,N,,,,,,
4,,,13400,13400,13400.0,36 months,13.49%,454.67,C,C2,...,,,Cash,N,,,,,,


In [66]:
df.shape

(103546, 145)

## 2017 Q1

In [67]:
file9 = './Datasets/Lending Club/lendclub_17-q1.csv'

In [68]:
df9 = pd.read_csv(file9, skiprows=1, skipfooter=2, engine='python')


In [69]:
df9.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,36 months,5.32%,451.73,A,A1,...,,,Cash,N,,,,,,
1,,,17000,17000,17000,36 months,7.49%,528.73,A,A4,...,,,Cash,N,,,,,,
2,,,20000,20000,20000,36 months,5.32%,602.3,A,A1,...,,,Cash,N,,,,,,
3,,,16000,16000,16000,60 months,12.74%,361.93,C,C1,...,,,Cash,N,,,,,,
4,,,2000,2000,2000,36 months,16.99%,71.3,D,D1,...,,,Cash,N,,,,,,


In [70]:
df9.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
96774,,,10400,10400,10400,36 months,14.99%,360.47,C,C4,...,,,Cash,N,,,,,,
96775,,,15000,15000,15000,36 months,12.74%,503.54,C,C1,...,,,Cash,N,,,,,,
96776,,,10000,10000,10000,36 months,8.24%,314.48,B,B1,...,,,Cash,N,,,,,,
96777,,,6325,6325,6325,36 months,15.99%,222.34,C,C5,...,,,Cash,N,,,,,,
96778,,,15625,15625,15625,60 months,28.69%,493.03,F,F1,...,,,Cash,N,,,,,,


In [71]:
df9.shape

(96779, 145)

In [72]:
df9.to_sql(name='LC_17-q1', con=engine, if_exists='replace', index = False)

In [73]:
SQL_STRING = '''

select * from "LC_17-q1"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,15000,15000,15000,36 months,5.32%,451.73,A,A1,...,,,Cash,N,,,,,,
1,,,17000,17000,17000,36 months,7.49%,528.73,A,A4,...,,,Cash,N,,,,,,
2,,,20000,20000,20000,36 months,5.32%,602.3,A,A1,...,,,Cash,N,,,,,,
3,,,16000,16000,16000,60 months,12.74%,361.93,C,C1,...,,,Cash,N,,,,,,
4,,,2000,2000,2000,36 months,16.99%,71.3,D,D1,...,,,Cash,N,,,,,,


In [74]:
df.shape

(96779, 145)

## 2017 Q2

In [75]:
file10 = './Datasets/Lending Club/lendclub_17-q2.csv'

In [76]:
df10 = pd.read_csv(file10, skiprows=1, skipfooter=2, engine='python')


In [77]:
df10.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,12000,12000,12000,60 months,30.65%,393.05,F,F4,...,,,Cash,N,,,,,,
1,,,4000,4000,4000,36 months,5.32%,120.46,A,A1,...,,,Cash,N,,,,,,
2,,,6000,6000,6000,36 months,9.44%,192.03,B,B1,...,,,Cash,N,,,,,,
3,,,25000,25000,25000,60 months,15.05%,595.41,C,C4,...,,,Cash,N,,,,,,
4,,,10000,10000,10000,36 months,7.21%,309.74,A,A3,...,,,Cash,N,,,,,,


In [78]:
df10.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
105446,,,24000,24000,24000,60 months,23.99%,690.3,E,E2,...,,,Cash,N,,,,,,
105447,,,10000,10000,10000,36 months,7.99%,313.32,A,A5,...,,,Cash,N,,,,,,
105448,,,10050,10050,10050,36 months,16.99%,358.26,D,D1,...,,,Cash,N,,,,,,
105449,,,6000,6000,6000,36 months,11.44%,197.69,B,B4,...,,,Cash,N,,,,,,
105450,,,30000,30000,30000,60 months,25.49%,889.18,E,E4,...,,,Cash,N,,,,,,


In [79]:
df10.shape

(105451, 145)

In [80]:
df10.to_sql(name='LC_17-q2', con=engine, if_exists='replace', index = False)

In [81]:
SQL_STRING = '''

select * from "LC_17-q2"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,12000,12000,12000,60 months,30.65%,393.05,F,F4,...,,,Cash,N,,,,,,
1,,,4000,4000,4000,36 months,5.32%,120.46,A,A1,...,,,Cash,N,,,,,,
2,,,6000,6000,6000,36 months,9.44%,192.03,B,B1,...,,,Cash,N,,,,,,
3,,,25000,25000,25000,60 months,15.05%,595.41,C,C4,...,,,Cash,N,,,,,,
4,,,10000,10000,10000,36 months,7.21%,309.74,A,A3,...,,,Cash,N,,,,,,


In [82]:
df.shape

(105451, 145)

## 2017 Q3

In [83]:
file11 = './Datasets/Lending Club/lendclub_17-q3.csv'

In [84]:
df11 = pd.read_csv(file11, skiprows=1, skipfooter=2, engine='python')


In [85]:
df11.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,7000,7000,7000,36 months,7.97%,219.26,A,A5,...,,,Cash,N,,,,,,
1,,,12000,12000,12000,36 months,7.97%,375.88,A,A5,...,,,Cash,N,,,,,,
2,,,16000,16000,16000,36 months,7.97%,501.17,A,A5,...,,,Cash,N,,,,,,
3,,,32000,32000,32000,36 months,11.99%,1062.71,B,B5,...,,,Cash,N,,,,,,
4,,,33000,33000,33000,36 months,7.21%,1022.12,A,A3,...,,,Cash,N,,,,,,


In [86]:
df11.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
122696,,,6000,6000,6000,36 months,23.88%,235.02,E,E2,...,,,Cash,N,,,,,,
122697,,,20000,20000,20000,60 months,26.30%,602.37,E,E5,...,,,Cash,N,,,,,,
122698,,,35000,35000,35000,60 months,30.89%,1151.58,G,G3,...,,,Cash,N,,,,,,
122699,,,30775,30775,30525,60 months,30.65%,1008.0,F,F4,...,,,Cash,N,,,,,,
122700,,,18900,18900,18900,60 months,30.94%,622.44,G,G4,...,,,Cash,N,,,,,,


In [87]:
df11.shape

(122701, 145)

In [88]:
df11.to_sql(name='LC_17-q3', con=engine, if_exists='replace', index = False)

In [89]:
SQL_STRING = '''

select * from "LC_17-q3"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,7000,7000,7000,36 months,7.97%,219.26,A,A5,...,,,Cash,N,,,,,,
1,,,12000,12000,12000,36 months,7.97%,375.88,A,A5,...,,,Cash,N,,,,,,
2,,,16000,16000,16000,36 months,7.97%,501.17,A,A5,...,,,Cash,N,,,,,,
3,,,32000,32000,32000,36 months,11.99%,1062.71,B,B5,...,,,Cash,N,,,,,,
4,,,33000,33000,33000,36 months,7.21%,1022.12,A,A3,...,,,Cash,N,,,,,,


In [90]:
df.shape

(122701, 145)

## 2017 Q4

In [91]:
file12 = './Datasets/Lending Club/lendclub_17-q4.csv'

In [92]:
df12 = pd.read_csv(file12, skiprows=1, skipfooter=2, engine='python')


In [93]:
df12.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,35000,35000,35000.0,60 months,11.99%,778.38,B,B5,...,,,Cash,N,,,,,,
1,,,6000,6000,6000.0,36 months,7.35%,186.23,A,A4,...,,,Cash,N,,,,,,
2,,,40000,40000,40000.0,36 months,6.08%,1218.33,A,A2,...,,,Cash,N,,,,,,
3,,,10000,10000,10000.0,60 months,23.88%,286.99,E,E2,...,,,Cash,N,,,,,,
4,,,27000,27000,27000.0,60 months,9.93%,572.75,B,B2,...,,,Cash,N,,,,,,


In [94]:
df12.tail()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
118643,,,12000,12000,12000.0,60 months,14.08%,279.72,C,C3,...,,,Cash,N,,,,,,
118644,,,12000,12000,12000.0,60 months,25.82%,358.01,E,E4,...,,,Cash,N,,,,,,
118645,,,10000,10000,10000.0,36 months,11.99%,332.1,B,B5,...,,,Cash,N,,,,,,
118646,,,12000,12000,12000.0,60 months,21.45%,327.69,D,D5,...,,,Cash,N,,,,,,
118647,,,16550,16550,16550.0,60 months,21.45%,451.94,D,D5,...,,,Cash,N,,,,,,


In [95]:
df12.shape

(118648, 145)

In [96]:
df12.to_sql(name='LC_17-q4', con=engine, if_exists='replace', index = False)

In [97]:
SQL_STRING = '''

select * from "LC_17-q4"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,,,35000,35000,35000.0,60 months,11.99%,778.38,B,B5,...,,,Cash,N,,,,,,
1,,,6000,6000,6000.0,36 months,7.35%,186.23,A,A4,...,,,Cash,N,,,,,,
2,,,40000,40000,40000.0,36 months,6.08%,1218.33,A,A2,...,,,Cash,N,,,,,,
3,,,10000,10000,10000.0,60 months,23.88%,286.99,E,E2,...,,,Cash,N,,,,,,
4,,,27000,27000,27000.0,60 months,9.93%,572.75,B,B2,...,,,Cash,N,,,,,,


In [98]:
df.shape

(118648, 145)

---
All 12 csv files are now in SQL tables. I can sum the lengths of the dataframes to check how large the result would be if I concatenated them all into one table.

---

In [99]:
frames = [df1, df2, df3, df4, df5, df6, df7, df8, df9, df10, df11, df12]
shape = (sum(frame.shape[0] for frame in frames), df.shape[1])   # (total rows, number of columns)

In [100]:
print(shape)

(1765427, 145)


---
The combined table would have 1,765,427 rows, which is nearly 1.8 million.

---
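
---
The same total can also be checked on the database side, without holding all 12 dataframes in memory. The following is a minimal sketch, not the method used in this notebook: an in-memory SQLite database stands in for my PostgreSQL engine, and the table names `LC_demo_a` and `LC_demo_b` and their contents are made up for illustration.

---

```python
import sqlite3
import pandas as pd

# Assumption: in-memory SQLite stands in for the PostgreSQL engine used in this notebook.
conn = sqlite3.connect(":memory:")

# Two toy tables with hypothetical names and contents.
pd.DataFrame({"loan_amnt": [5000, 2500, 2400]}).to_sql("LC_demo_a", conn, index=False)
pd.DataFrame({"loan_amnt": [10000, 3000]}).to_sql("LC_demo_b", conn, index=False)

table_names = ["LC_demo_a", "LC_demo_b"]

# Count rows inside the database; only one number per table is sent back to pandas.
row_counts = {
    name: int(pd.read_sql(f'select count(*) as n from "{name}"', con=conn)["n"].iloc[0])
    for name in table_names
}
total_rows = sum(row_counts.values())
print(row_counts, total_rows)
```

With the real engine, `table_names` would hold the 12 table names created above and `con=engine` would replace `con=conn`.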

In [101]:
# Test an SQL query
SQL_STRING = '''

select loan_status from "LC_16-q2"
'''

df = pd.read_sql(SQL_STRING, con=engine)
df.head()

Unnamed: 0,loan_status
0,Current
1,Charged Off
2,Current
3,Current
4,Fully Paid
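
---
One benefit of having the data in SQL is that aggregation can happen in the database before anything reaches pandas. As a hedged sketch, the query below counts loans per status. It runs against an in-memory SQLite table with the hypothetical name `LC_demo`, filled with only the five status values shown above, so the counts are illustrative and not real results.

---

```python
import sqlite3
import pandas as pd

# Assumption: in-memory SQLite stands in for the PostgreSQL engine used in this notebook.
conn = sqlite3.connect(":memory:")

# Toy table (hypothetical name) holding only the five statuses printed above.
toy = pd.DataFrame(
    {"loan_status": ["Current", "Charged Off", "Current", "Current", "Fully Paid"]}
)
toy.to_sql("LC_demo", conn, index=False)

# Count loans per status in SQL; ties are ordered alphabetically by status.
SQL_STRING = '''
select loan_status, count(*) as n_loans
from "LC_demo"
group by loan_status
order by n_loans desc, loan_status
'''

status_counts = pd.read_sql(SQL_STRING, con=conn)
print(status_counts)
```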


## Concatenation

---
I have now pushed all 12 csv files into SQL tables. Now I need to vertically concatenate them to form one large table and then push that into a new SQL table.

First, I read in all 12 SQL tables:

---

In [4]:
# (2007-11)
SQL_STRING = '''

select * from "LC_07-11"

'''

lc1 = pd.read_sql(SQL_STRING, con=engine)
lc1.shape

(42536, 145)

In [5]:
# (2012-13)
SQL_STRING = '''

select * from "LC_12-13"

'''

lc2 = pd.read_sql(SQL_STRING, con=engine)
lc2.shape

(188181, 145)

In [6]:
# (2014)
SQL_STRING = '''

select * from "LC_14"

'''

lc3 = pd.read_sql(SQL_STRING, con=engine)
lc3.shape

(235629, 145)

In [7]:
# (2015)
SQL_STRING = '''

select * from "LC_15"

'''

lc4 = pd.read_sql(SQL_STRING, con=engine)
lc4.shape

(421095, 145)

In [8]:
# (2016-Q1)
SQL_STRING = '''

select * from "LC_16-q1"

'''

lc5 = pd.read_sql(SQL_STRING, con=engine)
lc5.shape

(133887, 145)

In [9]:
# (2016-Q2)
SQL_STRING = '''

select * from "LC_16-q2"

'''

lc6 = pd.read_sql(SQL_STRING, con=engine)
lc6.shape

(97854, 145)

In [10]:
# (2016-Q3)
SQL_STRING = '''

select * from "LC_16-q3"

'''

lc7 = pd.read_sql(SQL_STRING, con=engine)
lc7.shape

(99120, 145)

In [11]:
# (2016-Q4)
SQL_STRING = '''

select * from "LC_16-q4"

'''

lc8 = pd.read_sql(SQL_STRING, con=engine)
lc8.shape

(103546, 145)

In [12]:
# (2017-Q1)
SQL_STRING = '''

select * from "LC_17-q1"

'''

lc9 = pd.read_sql(SQL_STRING, con=engine)
lc9.shape

(96779, 145)

In [13]:
# (2017-Q2)
SQL_STRING = '''

select * from "LC_17-q2"

'''

lc10 = pd.read_sql(SQL_STRING, con=engine)
lc10.shape

(105451, 145)

In [14]:
# (2017-Q3)
SQL_STRING = '''

select * from "LC_17-q3"

'''

lc11 = pd.read_sql(SQL_STRING, con=engine)
lc11.shape

(122701, 145)

In [15]:
# (2017-Q4)
SQL_STRING = '''

select * from "LC_17-q4"

'''

lc12 = pd.read_sql(SQL_STRING, con=engine)
lc12.shape

(118648, 145)
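
---
The 12 read cells above differ only in the table name, so they could be collapsed into a loop. The sketch below shows the idea under some assumptions: an in-memory SQLite database stands in for PostgreSQL, only three of the table names are used, and their contents are toy data. The `frames` dictionary is a name I introduce here for illustration.

---

```python
import sqlite3
import pandas as pd

# Assumption: in-memory SQLite stands in for the PostgreSQL engine used in this notebook.
conn = sqlite3.connect(":memory:")

# Three of the table names from this notebook, filled with toy rows for the demo.
table_names = ["LC_07-11", "LC_12-13", "LC_14"]
for i, name in enumerate(table_names):
    toy = pd.DataFrame({"loan_amnt": [1000 * (i + 1)] * (i + 2)})
    toy.to_sql(name, conn, index=False)

# Read every table with the same query template, keyed by table name.
frames = {
    name: pd.read_sql(f'select * from "{name}"', con=conn)
    for name in table_names
}

for name, frame in frames.items():
    print(name, frame.shape)
```

Keeping the dataframes in a dictionary means the shapes can still be inspected one by one, as in the cells above.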

---
Now that I have read all tables into pandas, I need to vertically concatenate them to form one large dataframe.

---

In [16]:
df = pd.concat([lc1, lc2], axis=0)
df.shape

(230717, 145)

In [17]:
df = pd.concat([df, lc3], axis=0)
df.shape

(466346, 145)

In [18]:
df = pd.concat([df, lc4], axis=0)
df.shape

(887441, 145)

In [19]:
df = pd.concat([df, lc5], axis=0)
df.shape

(1021328, 145)

In [20]:
df = pd.concat([df, lc6], axis=0)
df.shape

(1119182, 145)

In [21]:
df = pd.concat([df, lc7], axis=0)
df.shape

(1218302, 145)

In [22]:
df = pd.concat([df, lc8], axis=0)
df.shape

(1321848, 145)

In [23]:
df = pd.concat([df, lc9], axis=0)
df.shape

(1418627, 145)

In [24]:
df = pd.concat([df, lc10], axis=0)
df.shape

(1524078, 145)

In [25]:
df = pd.concat([df, lc11], axis=0)
df.shape

(1646779, 145)

In [26]:
df = pd.concat([df, lc12], axis=0)
df.shape

(1765427, 145)
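
---
pd.concat accepts a whole list of dataframes, so the step-by-step concatenation above can also be written as a single call. This is a small sketch on toy dataframes with made-up names and values. It gives the same result as concatenating one frame at a time, and the row check mirrors the shape checks above.

---

```python
import pandas as pd

# Toy stand-ins (hypothetical values) for lc1 ... lc12.
lc_a = pd.DataFrame({"loan_amnt": [5000, 2500]})
lc_b = pd.DataFrame({"loan_amnt": [2400, 10000, 3000]})
lc_c = pd.DataFrame({"loan_amnt": [7000]})
frames = [lc_a, lc_b, lc_c]

# One call stacks all frames vertically.
combined = pd.concat(frames, axis=0)

# The row count should equal the sum of the parts.
expected_rows = sum(len(frame) for frame in frames)
assert combined.shape[0] == expected_rows
print(combined.shape)
```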

---
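As the id column is empty, I will fill it with a running number from 1 to the number of rows. I cannot simply use the index + 1 here, because pd.concat kept each table's original index, so the index restarts at 0 for every table and its values repeat. A running number gives me a unique id for each record in case I need one.

---

In [27]:
df['id'] = range(1, len(df) + 1)   # Running number; unique even though the concatenated index repeats.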

In [28]:
df.head(50)

Unnamed: 0,id,member_id,loan_amnt,funded_amnt,funded_amnt_inv,term,int_rate,installment,grade,sub_grade,...,hardship_payoff_balance_amount,hardship_last_payment_amount,disbursement_method,debt_settlement_flag,debt_settlement_flag_date,settlement_status,settlement_date,settlement_amount,settlement_percentage,settlement_term
0,1,,5000.0,5000.0,4975.0,36 months,10.65%,162.87,B,B2,...,,,Cash,N,,,,,,
1,2,,2500.0,2500.0,2500.0,60 months,15.27%,59.83,C,C4,...,,,Cash,N,,,,,,
2,3,,2400.0,2400.0,2400.0,36 months,15.96%,84.33,C,C5,...,,,Cash,N,,,,,,
3,4,,10000.0,10000.0,10000.0,36 months,13.49%,339.31,C,C1,...,,,Cash,N,,,,,,
4,5,,3000.0,3000.0,3000.0,60 months,12.69%,67.79,B,B5,...,,,Cash,N,,,,,,
5,6,,5000.0,5000.0,5000.0,36 months,7.90%,156.46,A,A4,...,,,Cash,N,,,,,,
6,7,,7000.0,7000.0,7000.0,60 months,15.96%,170.08,C,C5,...,,,Cash,N,,,,,,
7,8,,3000.0,3000.0,3000.0,36 months,18.64%,109.43,E,E1,...,,,Cash,N,,,,,,
8,9,,5600.0,5600.0,5600.0,60 months,21.28%,152.39,F,F2,...,,,Cash,N,,,,,,
9,10,,5375.0,5375.0,5350.0,60 months,12.69%,121.45,B,B5,...,,,Cash,N,,,,,,
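
---
The first rows look fine either way, so it is worth checking uniqueness explicitly. pd.concat keeps each input's row labels unless ignore_index=True is passed, which means an id derived from the index of a concatenated dataframe can repeat. The sketch below demonstrates this on two toy dataframes with made-up values.

---

```python
import pandas as pd

# Toy stand-ins (hypothetical values) for two of the tables read from SQL.
part_a = pd.DataFrame({"loan_amnt": [5000, 2500, 2400]})
part_b = pd.DataFrame({"loan_amnt": [10000, 3000]})

# Default concat keeps the original labels, so the index restarts at 0.
stacked = pd.concat([part_a, part_b], axis=0)
print(stacked.index.tolist())      # [0, 1, 2, 0, 1]

# An id built from that index is therefore not unique.
id_from_index = stacked.index + 1
print(id_from_index.is_unique)     # False

# With ignore_index=True the rows are renumbered 0..n-1 and index + 1 is unique.
renumbered = pd.concat([part_a, part_b], axis=0, ignore_index=True)
renumbered["id"] = renumbered.index + 1
print(renumbered["id"].tolist())   # [1, 2, 3, 4, 5]
```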


---
I can now push the full dataframe into a new SQL table.

---

In [29]:
df.to_sql(name='Lending Club', con=engine, if_exists='replace', index = False)
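
---
An alternative to concatenating in pandas and pushing the result back is to build the combined table inside the database with UNION ALL. The following is only a sketch under assumptions: in-memory SQLite stands in for PostgreSQL, and the tables `LC_demo_a`, `LC_demo_b` and `LC_demo_all` are hypothetical names with toy contents. It relies on all tables having the same columns in the same order. In PostgreSQL the column types would also need to be compatible across tables, which I have not checked here.

---

```python
import sqlite3
import pandas as pd

# Assumption: in-memory SQLite stands in for the PostgreSQL engine used in this notebook.
conn = sqlite3.connect(":memory:")

# Two toy source tables (hypothetical names) with identical columns.
pd.DataFrame({"loan_amnt": [5000, 2500]}).to_sql("LC_demo_a", conn, index=False)
pd.DataFrame({"loan_amnt": [2400, 10000, 3000]}).to_sql("LC_demo_b", conn, index=False)

# Stack the tables inside the database; UNION ALL keeps duplicate rows.
conn.execute('''
create table "LC_demo_all" as
select * from "LC_demo_a"
union all
select * from "LC_demo_b"
''')
conn.commit()

# Confirm the combined row count without loading the table into pandas.
n_all = int(pd.read_sql('select count(*) as n from "LC_demo_all"', con=conn)["n"].iloc[0])
print(n_all)
```

This keeps the heavy lifting in the database, at the cost of losing the chance to inspect each dataframe in pandas along the way.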