## Example 1. Create a DataFrame using a dictionary of Series.

In [1]:
import pandas as pd

In [2]:
items = {'Bob': pd.Series(data=[123, 34, 78], index=['bike', 'pants', 'watch']),
         'Alice': pd.Series(data=[30, 304, 708, 23], index=['book', 'glasses', 'bike', 'pants'])}

In [3]:
shopping_carts = pd.DataFrame(items)

In [4]:
shopping_carts

Unnamed: 0,Bob,Alice
bike,123.0,708.0
book,,30.0
glasses,,304.0
pants,34.0,23.0
watch,78.0,


There are several things to notice here, as explained below:

1. We see that DataFrames are displayed in tabular form, much like an Excel spreadsheet, with the labels of rows and columns in bold.

2. Also, notice that the row labels of the DataFrame are built from the union of the index labels of the two Pandas Series we used to construct the dictionary, while the column labels of the DataFrame are taken from the keys of the dictionary.

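3. Another thing to notice is the column order. In the output above, the columns follow the order of the keys in the dictionary (Bob, then Alice); older versions of Pandas arranged them alphabetically instead. We will see later that the column order is also kept as-is when we load data into a DataFrame from a data file.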

4. The last thing we want to point out is that some NaN values appear in the DataFrame. NaN stands for Not a Number, and is Pandas' way of indicating that it doesn't have a value for that particular row and column index. For example, if we look at the Alice column, we see that it has NaN at the watch index. You can see why by looking at the dictionary we created at the beginning: it has no item for Alice labeled watch. So whenever a DataFrame is created, if a particular column doesn't have a value for a particular row index, Pandas puts a NaN value there.

5. If we were to feed this data into a machine learning algorithm, we would have to remove (or otherwise handle) these NaN values first.
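
Points 4 and 5 can be checked directly. The following is a minimal sketch, not part of the original notebook: it rebuilds shopping_carts, counts the NaN values per column, and shows two common ways of handling them. Which one is appropriate, and the placeholder value 0, are assumptions that depend on the data.

```python
import pandas as pd

items = {
    'Bob': pd.Series(data=[123, 34, 78], index=['bike', 'pants', 'watch']),
    'Alice': pd.Series(data=[30, 304, 708, 23],
                       index=['book', 'glasses', 'bike', 'pants']),
}
shopping_carts = pd.DataFrame(items)

# Count the missing entries in each column.
nan_counts = shopping_carts.isnull().sum()

# Option 1: keep only the rows in which every column has a value.
complete_rows = shopping_carts.dropna(axis=0)

# Option 2: keep every row and replace NaN with a placeholder (0 is an assumption).
filled = shopping_carts.fillna(0)

print(nan_counts)
print(complete_rows)
print(filled)
```

Both dropna() and fillna() return new DataFrames here, so shopping_carts itself is left unchanged.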

## Example 2. A DataFrame assigns numerical row indexes by default.

In [8]:
data = {
    'Bob' : pd.Series([23, 12, 22]),
    'Alice' : pd.Series([20, 11, 440, 111])
}

In [9]:
df = pd.DataFrame(data)

In [10]:
df

Unnamed: 0,Bob,Alice
0,23.0,20
1,12.0,11
2,22.0,440
3,,111
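
The output above also shows Bob's values as floats while Alice's remain integers. A short sketch, added here for illustration, inspects the default row labels and the column dtypes to see why.

```python
import pandas as pd

data = {
    'Bob': pd.Series([23, 12, 22]),
    'Alice': pd.Series([20, 11, 440, 111]),
}
df = pd.DataFrame(data)

# No index was given, so the rows are labeled 0, 1, 2, 3.
print(list(df.index))

# Bob has no value for row 3, so that column is stored as floats
# (NaN is a float); Alice is complete and stays an integer column.
print(df.dtypes)
```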


## Example 3. Demonstrate a few attributes of a DataFrame

In [11]:
shopping_carts.shape

(5, 2)

In [12]:
shopping_carts.ndim

2

In [13]:
shopping_carts.values

array([[123., 708.],
       [ nan,  30.],
       [ nan, 304.],
       [ 34.,  23.],
       [ 78.,  nan]])

In [14]:
shopping_carts.index

Index(['bike', 'book', 'glasses', 'pants', 'watch'], dtype='object')

In [15]:
shopping_carts.columns

Index(['Bob', 'Alice'], dtype='object')
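
A few related attributes are often used alongside the ones above. This sketch is an addition to the original notebook and assumes a reasonably recent Pandas version, since to_numpy() is not available in very old releases.

```python
import pandas as pd

items = {
    'Bob': pd.Series(data=[123, 34, 78], index=['bike', 'pants', 'watch']),
    'Alice': pd.Series(data=[30, 304, 708, 23],
                       index=['book', 'glasses', 'bike', 'pants']),
}
shopping_carts = pd.DataFrame(items)

# shape is a (rows, columns) tuple, so it can be unpacked.
n_rows, n_cols = shopping_carts.shape

# size is the total number of cells, i.e. rows * columns.
n_cells = shopping_carts.size

# to_numpy() returns the same underlying data as .values.
as_array = shopping_carts.to_numpy()

# dtypes lists the data type of each column.
print(n_rows, n_cols, n_cells)
print(as_array.shape)
print(shopping_carts.dtypes)
```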


When creating the shopping_carts DataFrame we passed the entire dictionary to the pd.DataFrame() function. However, there might be cases when you are only interested in a subset of the data. Pandas allows us to select which data we want to put into our DataFrame by means of the keywords columns and index. Let's see some examples:

In [19]:
bob_shopping_cart = pd.DataFrame(items, columns=['Bob'])

In [20]:
bob_shopping_cart

Unnamed: 0,Bob
bike,123
pants,34
watch,78


## Example 4. Selecting specific rows of a DataFrame

In [21]:
sel_shopping_cart = pd.DataFrame(items, index = ['pants', 'book'])

In [22]:
sel_shopping_cart

Unnamed: 0,Bob,Alice
pants,34.0,23
book,,30


## Example 5. Selecting specific rows and columns of a DataFrame

In [23]:
alice_sel_shopping_cart = pd.DataFrame(items, index = ['glass', 'bike'], columns = ['Alice'])

In [24]:
alice_sel_shopping_cart

Unnamed: 0,Alice
glass,
bike,708.0
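
The row label 'glass' used above does not occur in the data (the Series use 'glasses'), so Pandas returns NaN for that row instead of raising an error. The sketch below, added for illustration, contrasts the two labels; the variable names are hypothetical.

```python
import pandas as pd

items = {
    'Bob': pd.Series(data=[123, 34, 78], index=['bike', 'pants', 'watch']),
    'Alice': pd.Series(data=[30, 304, 708, 23],
                       index=['book', 'glasses', 'bike', 'pants']),
}

# 'glass' is not a label in either Series, so its row is filled with NaN.
unknown_label = pd.DataFrame(items, index=['glass', 'bike'], columns=['Alice'])

# 'glasses' does exist in Alice's Series, so its value is picked up.
known_label = pd.DataFrame(items, index=['glasses', 'bike'], columns=['Alice'])

print(unknown_label)
print(known_label)
```

Because a misspelled label fails silently, it can be worth checking the result with isnull() after selecting by index.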


## Example 6. Create a DataFrame using a dictionary of lists

In [25]:
data = {
    'Integers' : [1, 2, 3],
    'Floats' : [4.5, 8.2, 9.6]
}

In [26]:
df = pd.DataFrame(data)

In [27]:
df

Unnamed: 0,Integers,Floats
0,1,4.5
1,2,8.2
2,3,9.6
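
Unlike the Series in Example 2, plain lists carry no index, so Pandas cannot align lists of different lengths. The following sketch, an addition to the original notebook, shows the contrast.

```python
import pandas as pd

# Series of different lengths are aligned on their index;
# the missing cell becomes NaN.
from_series = pd.DataFrame({
    'Integers': pd.Series([1, 2, 3]),
    'Floats': pd.Series([4.5, 8.2]),
})
print(from_series)

# Lists of different lengths cannot be aligned, so the constructor raises.
try:
    pd.DataFrame({'Integers': [1, 2, 3], 'Floats': [4.5, 8.2]})
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)
```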


## Example 7. Create a DataFrame using a dictionary of lists, and custom row-indexes (labels)

In [28]:
data = {
    'Integers' : [1, 2, 3],
    'Floats' : [4.5, 8.2, 9.6]
}

In [30]:
df = pd.DataFrame(data, index = ['label1', 'label2', 'label3'])

In [31]:
df

Unnamed: 0,Integers,Floats
label1,1,4.5
label2,2,8.2
label3,3,9.6


## Example 8. Create a DataFrame using a list of dictionaries

In [35]:
items2 = [
    {'bikes': 20, 'pants': 30, 'watches': 35},
    {'watches' : 10, 'glasses': 50, 'bikes': 15, 'pants': 5}
]

In [36]:
store_items = pd.DataFrame(items2)

In [42]:
store_items

Unnamed: 0,bikes,pants,watches,glasses
0,20,30,35,
1,15,5,10,50.0


## Example 9. Create a DataFrame using a list of dictionaries, and custom row-indexes (labels)

In [40]:
store_items = pd.DataFrame(items2, index = ['Store1', 'Store2'])

In [41]:
store_items

Unnamed: 0,bikes,pants,watches,glasses
Store1,20,30,35,
Store2,15,5,10,50.0
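
With a list of dictionaries, each dictionary becomes one row and the columns come from the union of all keys. The sketch below, added for illustration, confirms this and shows that a key missing from one dictionary yields NaN in that row.

```python
import pandas as pd

items2 = [
    {'bikes': 20, 'pants': 30, 'watches': 35},
    {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5},
]
store_items = pd.DataFrame(items2, index=['Store1', 'Store2'])

# Columns are the union of the keys of both dictionaries.
print(sorted(store_items.columns))

# The first dictionary has no 'glasses' key, so Store1 gets NaN there.
print(store_items.loc['Store1', 'glasses'])

# The NaN forces the 'glasses' column to a float dtype.
print(store_items.dtypes)
```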


## Additional Reading - Pandas Documentation

1. Refer to Intro to data structures for an overview of both data structures, Series and DataFrame: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#intro-to-data-structures

2. Refer to the Attributes and underlying data section in the DataFrame documentation.