In [1]:
person = {
    'first': 'Corey',
    'last': 'Schafer',
    'email': 'CoreyMSchafer@gmail.com'
}

In [2]:
people = {
    'first': ['Corey'],
    'last': ['Schafer'],
    'email': ['CoreyMSchafer@gmail.com']
}

In [3]:
people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}

This is how we can represent the same information using just Python: a dictionary whose keys are the column names and whose values are lists holding the rows.

In [4]:
# Get the email column with all of its rows.
people['email']

['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
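
Getting a column out of this structure is easy, but getting one person's full record is not. As a rough sketch of what that takes in plain Python (the helper name `get_row` is hypothetical and only used for illustration):

```python
people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}


def get_row(data, position):
    """Collect the value at `position` from every column list (hypothetical helper)."""
    return {column: values[position] for column, values in data.items()}


# Pull the second person's record by walking over every column.
second_person = get_row(people, 1)
print(second_person)
```

This kind of manual bookkeeping is what a DataFrame handles for us.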

Now let's create a DataFrame from this dictionary and see what it looks like:

In [5]:
import pandas as pd

In [6]:
df = pd.DataFrame(people)
df

   first     last                    email
0  Corey  Schafer  CoreyMSchafer@gmail.com
1   Jane      Doe        JaneDoe@email.com
2   John      Doe        JohnDoe@email.com


Now the DataFrame represents this data as rows and columns that we can actually see: our people are printed out in a nice table.

The values on the far left that have no column name (0, 1, and 2) are the index.
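
As a small sketch, we can look at the index directly through the `df.index` attribute. Since we did not supply an index ourselves, pandas assigns a default integer one that starts at 0:

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}
df = pd.DataFrame(people)

# The index object itself, then its labels as a plain list.
print(df.index)
index_labels = list(df.index)
print(index_labels)
```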

## Accessing information within the DataFrame

In [7]:
# Access the values of a single column.
df['email']

0    CoreyMSchafer@gmail.com
1          JaneDoe@email.com
2          JohnDoe@email.com
Name: email, dtype: object

This returns a Series, which we can confirm by checking the type:

In [8]:
type(df['email'])

pandas.core.series.Series

The type is pandas.core.series.Series, so this is a Series object.

So, what is a Series? It is essentially a list of data (a 1-dimensional array), but just like a DataFrame it has a lot more functionality than a plain list. You can think of it as the rows of a single column.

A DataFrame, then, is rows and columns, and a Series is the rows of a single column. A DataFrame is basically a container for several of these Series objects.

In our DataFrame, first, last, and email are each a Series.

A Series also has an index, just like our DataFrame does.
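
To make the "container of Series" idea concrete, here is a minimal sketch that checks each column of our DataFrame: every one is a Series, and every one carries the same index as the DataFrame.

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}
df = pd.DataFrame(people)

for column_name in df.columns:
    column = df[column_name]
    # Each column is a Series whose index matches the DataFrame's index.
    print(column_name, type(column).__name__, column.index.equals(df.index))

first_names = df['first'].tolist()
print(first_names)
```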

In [9]:
df.email

0    CoreyMSchafer@gmail.com
1          JaneDoe@email.com
2          JohnDoe@email.com
Name: email, dtype: object

Dot notation returns the same Series, but bracket notation is the safer choice because a column can have the same name as a DataFrame method or attribute.

Example: a DataFrame has a method called count. If you also have a column called count, df.count gives you the method and not the column, while df['count'] gives you the column.
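
A quick sketch of that name clash, using a small made-up DataFrame (the `scores` data is hypothetical and not part of our people example):

```python
import pandas as pd

# Hypothetical data: one column is deliberately named 'count'.
scores = pd.DataFrame({'name': ['a', 'b'], 'count': [10, 20]})

# Dot notation finds the DataFrame.count method, not the column.
print(callable(scores.count))

# Bracket notation always returns the column.
count_column = scores['count']
print(count_column.tolist())
```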

## Other DataFrame functionality

In [11]:
# Access multiple columns by passing a list of column names.
df[['last', 'email']]

      last                    email
0  Schafer  CoreyMSchafer@gmail.com
1      Doe        JaneDoe@email.com
2      Doe        JohnDoe@email.com


This is no longer a Series because, remember, a Series is the rows of a single column (a 1-dimensional array). In this case we get back another DataFrame (a __filtered down DataFrame__) containing only the columns we asked for.
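
The same rule applies even when the list holds only one column name. A minimal sketch comparing the two forms:

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}
df = pd.DataFrame(people)

as_series = df['email']    # a string key returns a Series
as_frame = df[['email']]   # a list key returns a DataFrame, even with one column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```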

In [12]:
# See all of the columns at once.
# The columns are stored as an Index containing first, last, and email.
df.columns

Index(['first', 'last', 'email'], dtype='object')

In [13]:
type(df.columns)

pandas.core.indexes.base.Index

#### Getting rows

In order to get rows we can use __loc__ and __iloc__ indexers:

In [31]:
# iloc allows us to access rows by integer location hence the name. iloc = integer location
# Get the first row.

df.iloc[0]

first                      Corey
last                     Schafer
email    CoreyMSchafer@gmail.com
Name: 0, dtype: object

This returns a Series containing the values of that row: the first name, last name, and email of the first person. The index of this result is the set of column names, so we know what each value is.

Whenever we access a single row, pandas uses the column names as the index of the result; if the index were just 0, 1, and 2, we might not know what the values mean.
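
As a short sketch, we can inspect the pieces of that row Series: its index holds the column names, its name is the row's own label, and individual values can be looked up by column name.

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}
df = pd.DataFrame(people)

row = df.iloc[0]

row_index = list(row.index)   # the column names of the DataFrame
print(row_index)
print(row.name)               # the label this row had in the DataFrame
print(row['email'])           # look up a value in the row by column name
```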

Just like when we selected multiple columns, we can select multiple rows by passing in a list of integers. So if we want the second and third rows (positions 1 and 2), we can say:

In [29]:
df.iloc[[1, 2]]

  first last              email
1  Jane  Doe  JaneDoe@email.com
2  John  Doe  JohnDoe@email.com


We can see that now we're getting a DataFrame with these multiple rows.

With the __iloc__ and __loc__ indexers we can also select columns; the column selection is the second value that we pass inside the outer brackets.

So if we think of __iloc__ and __loc__ as functions, the rows we want are the first argument and the columns are the second argument.

- __iloc__: We __can't__ pass an actual column name, because iloc uses integer locations only.

In [32]:
# Get the last column (email) for the first two rows.
df.iloc[[0, 1], -1]

# type(df.iloc[[0, 1], -1])    -> Series: a single integer selects one column
# type(df.iloc[[0, 1], [-1]])  -> DataFrame: a list of integers keeps the result 2-dimensional

0    CoreyMSchafer@gmail.com
1          JaneDoe@email.com
Name: email, dtype: object
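
The commented-out type checks in that cell can be run as a small sketch. It also shows that position 2 and position -1 refer to the same column here, since our DataFrame has three columns:

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}
df = pd.DataFrame(people)

one_column = df.iloc[[0, 1], -1]     # single integer for the column: Series
kept_2d = df.iloc[[0, 1], [-1]]      # list of integers for the column: DataFrame
same_column = df.iloc[[0, 1], 2]     # position 2 is the last of three columns

print(type(one_column).__name__)
print(type(kept_2d).__name__)
print(one_column.equals(same_column))
```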

#### loc

With __loc__ we search by label. For rows, the labels are the index values. Our DataFrame still has the default index of 0, 1, and 2, so here the row labels happen to match the integer positions.

In [33]:
df

   first     last                    email
0  Corey  Schafer  CoreyMSchafer@gmail.com
1   Jane      Doe        JaneDoe@email.com
2   John      Doe        JohnDoe@email.com


In [34]:
# Get the row with the label of 0
df.loc[0]

first                      Corey
last                     Schafer
email    CoreyMSchafer@gmail.com
Name: 0, dtype: object

In [35]:
# Multiple rows: the rows labeled 0 and 1 (the first and second rows).
df.loc[[0, 1]]

   first     last                    email
0  Corey  Schafer  CoreyMSchafer@gmail.com
1   Jane      Doe        JaneDoe@email.com


In [36]:
# Get the email column for those 2 rows. WITH LOC WE USE THE LABELS!
df.loc[[0, 1], 'email']

0    CoreyMSchafer@gmail.com
1          JaneDoe@email.com
Name: email, dtype: object

In [37]:
# Get the last and email columns for those 2 rows. WITH LOC WE USE THE LABELS!
# The columns are shown in the same order as the list we pass in.
df.loc[[0, 1], ['last', 'email']]

      last                    email
0  Schafer  CoreyMSchafer@gmail.com
1      Doe        JaneDoe@email.com
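
With the default index, labels and integer positions look identical, which can hide the difference between loc and iloc. As a sketch, suppose we give the rows string labels instead (the labels 'a', 'b', 'c' and the name `labeled` are assumptions made up for this example):

```python
import pandas as pd

people = {
    'first': ['Corey', 'Jane', 'John'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JohnDoe@email.com']
}

# Hypothetical row labels, passed through the index argument of the constructor.
labeled = pd.DataFrame(people, index=['a', 'b', 'c'])

by_label = labeled.loc['b']       # the row whose label is 'b'
by_position = labeled.iloc[1]     # the row at integer position 1

print(by_label.equals(by_position))
print(labeled.loc[['a', 'b'], 'email'])
```

With these labels, labeled.loc[0] raises a KeyError because no row is labeled 0, while labeled.iloc[0] still returns the first row.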
