<a class='anchor' id='top'></a>
<h1>Different Ways of Creating DataFrames</h1>

<h2> Contents </h2>

* [Using CSV](#csv)
* [Using Excel](#excel)
* [Python Dict](#python)
* [From Tuples](#tuples)
* [List of Dicts](#list)
* [Other Ways](#other)

<hr>

- [Back to Top](#top)
<a class='anchor' id='csv'></a>
<h2>Using CSV</h2>



In [39]:
import pandas as pd

dfWtr = pd.read_csv(r'C:\Users\Work\Desktop\Python Lessons\Data Science\Data Science w Py Course\Data For Use\weather.csv')
dfWtr

| | day | temperature | windspeed | event |
|---|---|---|---|---|
| 0 | 1/1/2017 | 32 | 6 | Rain |
| 1 | 1/2/2017 | 35 | 7 | Sunny |
| 2 | 1/3/2017 | 28 | 2 | Snow |
| 3 | 1/4/2017 | 24 | 7 | Snow |
| 4 | 1/5/2017 | 32 | 4 | Rain |
| 5 | 1/6/2017 | 32 | 2 | Sunny |
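
The path in the cell above only exists on one machine. As a minimal sketch, assuming the same column layout as `weather.csv`, the snippet below reads the data from an in-memory string so it can run anywhere. The `parse_dates` argument is an optional extra that the original call does not use.

```python
import io

import pandas as pd

# Hypothetical stand-in for weather.csv: same columns, inlined so no file is needed.
csv_text = """day,temperature,windspeed,event
1/1/2017,32,6,Rain
1/2/2017,35,7,Sunny
1/3/2017,28,2,Snow
"""

# read_csv accepts a path or any file-like object; StringIO wraps the string.
# parse_dates converts the 'day' column from text to datetime values.
weather = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])

print(weather.dtypes)
print(weather)
```

Parsing the dates up front makes later steps such as sorting or filtering by day work on real dates rather than strings.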


<hr>

- [Back to Top](#top)
<a class='anchor' id='excel'></a>
<h2>Using Excel</h2>

The general form for reading an Excel file is:

<code>pd.read_excel(<i>param1, param2</i>)</code> where:
- <code>pd.read_excel()</code> reads the file into a DataFrame
- <i>param1</i> is the Excel file name or path, e.g. 'FILE_NAME.xlsx'
- <i>param2</i> is the sheet name, e.g. 'Sheet1'

In [37]:
colnames = ['public_carriers', 'factoring_co']
jkrf = pd.read_excel(r'C:\Users\Work\Documents\CalTemp\Business\JKR Freight\Carriers\CarrierDatabase.xlsx', sheet_name='Sheet1', header=None, names=colnames)
jkrf = jkrf.drop(labels=range(0, 6), axis=0)  # drop the first six rows (unwanted values)
jkrf['factoring_co'] = jkrf['factoring_co'].fillna('redacted')  # replace NaN values
jkrf = jkrf.set_index('public_carriers')  # set index to carrier names
jkrf.loc[jkrf['factoring_co'] != 'redacted', 'factoring_co']  # carriers whose factoring company is not 'redacted'


public_carriers
HMD LLC                                           HMD
Harika Trucking                        Royal Invoices
Elephant Express Inc                       JD Factors
Old Iron Trucking                       RTS Financial
Run Direct                              RTS Financial
C1 Transportation                       RTS Financial
PC Logistics Group INC.     Compass Funding Solutions
New Era Trucking           CarrierNet Group Financial
Name: factoring_co, dtype: object
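
The cleanup steps above do not depend on Excel. As a minimal sketch on a small made-up DataFrame (the carrier names and values here are hypothetical, not taken from the spreadsheet), the same fill, index, and filter sequence looks like this, using `.loc` for the final selection:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the spreadsheet contents.
carriers = pd.DataFrame({
    "public_carriers": ["Carrier A", "Carrier B", "Carrier C"],
    "factoring_co": ["Factor X", np.nan, "Factor Y"],
})

# Replace missing factoring companies with a placeholder.
carriers["factoring_co"] = carriers["factoring_co"].fillna("redacted")

# Use the carrier name as the row index.
carriers = carriers.set_index("public_carriers")

# Keep only carriers whose factoring company is known.
known = carriers.loc[carriers["factoring_co"] != "redacted", "factoring_co"]
print(known)
```

Selecting rows and the column in a single `.loc` call avoids chained indexing, which can behave unexpectedly when the result is later assigned to.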

<hr>

- [Back to Top](#top)
<a class='anchor' id='python'></a>
<h2>Using Python Dictionary</h2>

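To create a DataFrame on the fly, or from data that can't be parsed from a file, pass a dictionary to <code>pd.DataFrame()</code>. Each key becomes a column name, and each value is a list holding that column's data: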

In [43]:
contact_info = {
    'Contact Info':['Email', 'Phone', 'Website', 'LinkedIn'],
    'Calvin King': ['mrcking88@gmail.com','336.549.1149','github.com/xxkohxx','/in/expCalvinKing'],
    'Jason Borne': ['jbourne@topsecret.com', 'xxx-xxx-xxxx','bourneidentity.com','redacted'],
    'Michael Scott': ['mscott@dundermifflin', '800-627-0114','www.sceendy.com','redacted'],
    'Jim Halpert': ['jhalpert@dundermifflin','800-627-0114','www.sceendy.com','redacted'],
}

pprCo = pd.DataFrame(contact_info)

pprCo

| | Contact Info | Calvin King | Jason Borne | Michael Scott | Jim Halpert |
|---|---|---|---|---|---|
| 0 | Email | mrcking88@gmail.com | jbourne@topsecret.com | mscott@dundermifflin | jhalpert@dundermifflin |
| 1 | Phone | 336.549.1149 | xxx-xxx-xxxx | 800-627-0114 | 800-627-0114 |
| 2 | Website | github.com/xxkohxx | bourneidentity.com | www.sceendy.com | www.sceendy.com |
| 3 | LinkedIn | /in/expCalvinKing | redacted | redacted | redacted |
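
In the dictionary above, each key becomes a column. When each key reads more naturally as a row label, `pd.DataFrame.from_dict` with `orient="index"` is one option. A minimal sketch with made-up names and values:

```python
import pandas as pd

# Hypothetical data: each key is a person, each list is that person's row.
people = {
    "Alice": ["alice@example.com", "555-0100"],
    "Bob": ["bob@example.com", "555-0101"],
}

# orient="index" turns the keys into row labels; columns= names the fields.
contacts = pd.DataFrame.from_dict(people, orient="index", columns=["Email", "Phone"])

print(contacts)
print(contacts.loc["Bob", "Email"])
```

With people as the index, a single value can be looked up by row label and column name.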


<hr>

- [Back to Top](#top)
<a class='anchor' id='tuples'></a>
<h2>Using a List of Tuples</h2>

You can also create a DataFrame from a list of tuples, where each tuple holds the values for one row:

<code>rows = [ (Val1, Val2, Val3) ]</code>

Then pass the list to the constructor along with the column names:

<code>df = pd.DataFrame(rows, columns=['Col1_Name', 'Col2_Name', 'Col3_Name'])</code>

In [46]:
tupl = [
    ('1/1/2017',32,6,'Rain'),
    ('1/2/2017',35,7,'Sunny'),
    ('1/3/2017',28,2,'Snow')
]

tupl = pd.DataFrame(tupl, columns=['day','temperature','windspeed','event'])
tupl

| | day | temperature | windspeed | event |
|---|---|---|---|---|
| 0 | 1/1/2017 | 32 | 6 | Rain |
| 1 | 1/2/2017 | 35 | 7 | Sunny |
| 2 | 1/3/2017 | 28 | 2 | Snow |
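
If the rows are named tuples rather than plain tuples, pandas can take the column names from the tuple's fields. A minimal sketch, where the `Reading` type is a hypothetical name introduced only for illustration:

```python
from collections import namedtuple

import pandas as pd

# Hypothetical record type; its field names become the column names.
Reading = namedtuple("Reading", ["day", "temperature", "windspeed", "event"])

readings = [
    Reading("1/1/2017", 32, 6, "Rain"),
    Reading("1/2/2017", 35, 7, "Sunny"),
]

# No columns= argument is needed: pandas reads the names from the namedtuple.
df = pd.DataFrame(readings)
print(df)
```

This keeps the column names next to the data definition instead of repeating them in the constructor call.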


<hr>

- [Back to Top](#top)
<a class='anchor' id='list'></a>
<h2>Using List of Dictionaries</h2>

Another way is to use a list of dictionaries, one dictionary per row:

<code>var_name = [ {'COL1_Title': 'COL1_Data', 'COL2_Title': 'COL2_Data'} ]</code>

Then pass the list to the constructor:

<code>df = pd.DataFrame(var_name)</code>

In [48]:
var1 = [
    {'Fruit':'Melon', 'Size':'5cm','Color':'Beige','Sweetness':'Semi'},
    {'Fruit':'Tomato', 'Size':'2cm','Color':'Red','Sweetness':'Semi'},
    {'Fruit':'Pear', 'Size':'3cm','Color':'Autumn','Sweetness':'Semi'},
    {'Fruit':'Grape', 'Size':'1cm','Color':'Purple','Sweetness':'Very'}
]

var1 = pd.DataFrame(var1)
var1

| | Fruit | Size | Color | Sweetness |
|---|---|---|---|---|
| 0 | Melon | 5cm | Beige | Semi |
| 1 | Tomato | 2cm | Red | Semi |
| 2 | Pear | 3cm | Autumn | Semi |
| 3 | Grape | 1cm | Purple | Very |
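
The dictionaries in the list do not all need the same keys. As a minimal sketch with made-up rows, a key that is missing from one dictionary is filled with NaN in that row:

```python
import pandas as pd

# Hypothetical rows; the second dictionary has no 'Color' key.
rows = [
    {"Fruit": "Melon", "Size": "5cm", "Color": "Beige"},
    {"Fruit": "Grape", "Size": "1cm"},
]

fruit = pd.DataFrame(rows)

# The missing 'Color' value appears as NaN in the second row.
print(fruit)
print(fruit["Color"].isna().tolist())
```

This makes a list of dictionaries a convenient fit for records whose fields are only partly filled in.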


<hr>

- [Back to Top](#top)
<a class='anchor' id='other'></a>
<h2>Other Ways</h2>

<a href="https://pandas.pydata.org/docs/user_guide/io.html"><h3>pandas IO Tools Documentation</h3></a>
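
As one illustration of the other readers listed there, the sketch below uses `pd.read_json` on an in-memory JSON string. The payload is hypothetical and inlined so that no file is needed.

```python
import io

import pandas as pd

# Hypothetical JSON payload: a list of records, one object per row.
json_text = '[{"day": "1/1/2017", "temperature": 32}, {"day": "1/2/2017", "temperature": 35}]'

# Wrapping the string in StringIO gives read_json a file-like object to read.
weather_json = pd.read_json(io.StringIO(json_text))
print(weather_json)
```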
