This notebook is reference code for input and output. `pandas` can read a variety of file types using its `pd.read_*` methods. Let's take a look at the most common data types.

In [74]:
import numpy as np
import pandas as pd

Keep in mind that the files referenced here, such as the example CSV file and the example Excel file, need to be in the same location as your notebook.

To check the location of your notebook, run the following:

In [75]:
pwd

'd:\\Coding time\\Data Science and Machine Learning\\Self-study Data Science\\Data Science and Machine Learning\\2 - Python for Data Analyst\\Pandas'
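
As a rough sketch of the same check done from Python rather than with `pwd`, the snippet below prints the working directory and tests whether a file is present there. It assumes relative paths are resolved against the notebook's working directory; `example.csv` is the file name used later in this notebook, and the variable names are only illustrative.

```python
from pathlib import Path

# Current working directory: the folder that relative paths are resolved against
cwd = Path.cwd()
print(cwd)

# Build the full path to the file and check whether it is really there
csv_path = cwd / 'example.csv'
print(csv_path.exists())  # True only if the file sits in the working directory
```

If the second line prints `False`, either move the file next to the notebook or pass the full path to `pd.read_csv`.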

### **CSV.**

In [76]:
# CSV Input
df = pd.read_csv('example.csv')
df

Unnamed: 0,a,b,c,d
0,0,1,2,3
1,4,5,6,7
2,8,9,10,11
3,12,13,14,15


In [77]:
# CSV Output
df.to_csv('My_output.csv')
pd.read_csv('My_output.csv')

Unnamed: 0.1,Unnamed: 0,a,b,c,d
0,0,0,1,2,3
1,1,4,5,6,7
2,2,8,9,10,11
3,3,12,13,14,15


In [78]:
df.to_csv('My_output.csv', index = False)
pd.read_csv('My_output.csv') # the '.csv' extension is optional, but the name must then match in both to_csv and read_csv

Unnamed: 0,a,b,c,d
0,0,1,2,3
1,4,5,6,7
2,8,9,10,11
3,12,13,14,15


Let me explain a bit:

+ `df.to_csv(...)` exports the data in a DataFrame to a CSV file. Here `df` is the DataFrame you want to export, and the `to_csv()` method is called on it to perform the export.
+ The first argument, `'My_output.csv'`, is the name of the CSV file you want to create.
+ Parameter `index`:
  + If `index = True` (the default), the DataFrame's index is written to the CSV file as an extra first column with no header. This is useful if the index contains information you want to keep. When the file is read back with a plain `pd.read_csv`, that column appears as `Unnamed: 0`, as in the output above.
  + If `index = False`, the DataFrame's index is not written to the CSV file. This is useful if the index carries no important information, or if you do not want an index column in the file.
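
To make the effect of `index` concrete, here is a minimal sketch that writes CSV text to a string instead of a file, so it runs without touching the disk. The small `demo` DataFrame is made up for this illustration (it is not `example.csv`), and `index_col=0` is shown as one possible way to restore the index when reading back.

```python
import io
import pandas as pd

# Small stand-in DataFrame (made up for this sketch)
demo = pd.DataFrame({'a': [0, 4], 'b': [1, 5]})

# With no path, to_csv returns the CSV text as a string
with_index = demo.to_csv()                # index=True is the default
without_index = demo.to_csv(index=False)  # data columns only

print(with_index)
print(without_index)

# Reading the text that includes the index yields an extra 'Unnamed: 0' column
round_trip = pd.read_csv(io.StringIO(with_index))
print(round_trip.columns.tolist())

# index_col=0 tells pandas to use that first column as the index instead
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
print(restored.columns.tolist())
```

So there are two ways to avoid the stray column: skip the index when writing (`index=False`), or reclaim it when reading (`index_col=0`).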

### **Excel.**

`pandas` can read and write Excel files. Keep in mind that this imports only data, not formulas or images. A workbook that contains images or macros may cause `read_excel` to crash.

In [79]:
# Excel Input
df = pd.read_excel('Excel_Sample.xlsx', sheet_name = 'Sheet1') # sheet_name means the name of the sheet you want to read
df

Unnamed: 0.1,Unnamed: 0,a,b,c,d
0,0,0,1,2,3
1,1,4,5,6,7
2,2,8,9,10,11
3,3,12,13,14,15


In [80]:
# Excel Output
df.to_excel('Excel_Sample1.xlsx', sheet_name = 'NewSheet', index = False) # sheet_name means the name of the sheet you want to write
pd.read_excel('Excel_Sample1.xlsx', sheet_name = 'NewSheet')

Unnamed: 0.1,Unnamed: 0,a,b,c,d
0,0,0,1,2,3
1,1,4,5,6,7
2,2,8,9,10,11
3,3,12,13,14,15


### **HTML.**

The `pandas.read_html` function reads the tables on a web page and returns a list of DataFrame objects:

In [84]:
df = pd.read_html('https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/')
df

[                         Bank NameBank       CityCity StateSt  CertCert  \
 0                        Citizens Bank       Sac City      IA      8758   
 1             Heartland Tri-State Bank        Elkhart      KS     25851   
 2                  First Republic Bank  San Francisco      CA     59017   
 3                       Signature Bank       New York      NY     57053   
 4                  Silicon Valley Bank    Santa Clara      CA     24735   
 ..                                 ...            ...     ...       ...   
 563                 Superior Bank, FSB       Hinsdale      IL     32646   
 564                Malta National Bank          Malta      OH      6629   
 565    First Alliance Bank & Trust Co.     Manchester      NH     34264   
 566  National State Bank of Metropolis     Metropolis      IL      3815   
 567                   Bank of Honolulu       Honolulu      HI     21029   
 
                  Acquiring InstitutionAI Closing DateClosing  FundFund  
 0          

In [86]:
type(df)

list

In [94]:
df[0]

Unnamed: 0,Bank NameBank,CityCity,StateSt,CertCert,Acquiring InstitutionAI,Closing DateClosing,FundFund
0,Citizens Bank,Sac City,IA,8758,Iowa Trust & Savings Bank,"November 3, 2023",10545
1,Heartland Tri-State Bank,Elkhart,KS,25851,"Dream First Bank, N.A.","July 28, 2023",10544
2,First Republic Bank,San Francisco,CA,59017,"JPMorgan Chase Bank, N.A.","May 1, 2023",10543
3,Signature Bank,New York,NY,57053,"Flagstar Bank, N.A.","March 12, 2023",10540
4,Silicon Valley Bank,Santa Clara,CA,24735,First–Citizens Bank & Trust Company,"March 10, 2023",10539
...,...,...,...,...,...,...,...
563,"Superior Bank, FSB",Hinsdale,IL,32646,"Superior Federal, FSB","July 27, 2001",6004
564,Malta National Bank,Malta,OH,6629,North Valley Bank,"May 3, 2001",4648
565,First Alliance Bank & Trust Co.,Manchester,NH,34264,Southern New Hampshire Bank & Trust,"February 2, 2001",4647
566,National State Bank of Metropolis,Metropolis,IL,3815,Banterra Bank of Marion,"December 14, 2000",4646


In [103]:
df[0].head()
# df[0].head(50): to see the first 50 rows

Unnamed: 0,Bank NameBank,CityCity,StateSt,CertCert,Acquiring InstitutionAI,Closing DateClosing,FundFund
0,Citizens Bank,Sac City,IA,8758,Iowa Trust & Savings Bank,"November 3, 2023",10545
1,Heartland Tri-State Bank,Elkhart,KS,25851,"Dream First Bank, N.A.","July 28, 2023",10544
2,First Republic Bank,San Francisco,CA,59017,"JPMorgan Chase Bank, N.A.","May 1, 2023",10543
3,Signature Bank,New York,NY,57053,"Flagstar Bank, N.A.","March 12, 2023",10540
4,Silicon Valley Bank,Santa Clara,CA,24735,First–Citizens Bank & Trust Company,"March 10, 2023",10539


In [104]:
df[1] # if there are multiple tables in the page, you can access them by using df[1], df[2], etc.

IndexError: list index out of range
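
The `IndexError` above simply means the page contained only one table. A minimal sketch of guarding against this: check the length of the list before indexing. To keep it runnable offline, the list here is built by hand as a stand-in for what `pd.read_html` returns, and `get_table` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Stand-in for the list returned by pd.read_html (built by hand for this sketch)
tables = [pd.DataFrame({'Bank Name': ['Citizens Bank'], 'City': ['Sac City']})]

def get_table(tables, position):
    """Return tables[position], or None when the page had fewer tables."""
    if 0 <= position < len(tables):
        return tables[position]
    return None

print(len(tables))             # how many tables were found
first = get_table(tables, 0)   # the only table in this stand-in list
second = get_table(tables, 1)  # None instead of raising IndexError
print(second is None)
```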