# How do I select a pandas Series from a DataFrame?

In [1]:
import pandas as pd

There are two basic object types in pandas:
* **DataFrame** --> a table of rows and columns.

* **Series** --> each column of a DataFrame is a pandas Series. A Series can exist on its own, without being part of a DataFrame, but most of the time it is a DataFrame column.
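
To make the distinction concrete, here is a minimal sketch. The names `temps` and `df` and their contents are made up for illustration and are not part of the UFO dataset used below.

```python
import pandas as pd

# A standalone Series: it is not attached to any DataFrame.
temps = pd.Series([21.5, 19.0, 23.1], name='temperature')

# A small hypothetical DataFrame; each of its columns is itself a Series.
df = pd.DataFrame({'city': ['Ithaca', 'Holyoke'], 'state': ['NY', 'CO']})

print(type(temps))       # the standalone Series
print(type(df))          # the DataFrame
print(type(df['city']))  # one column of the DataFrame, again a Series
```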

In [2]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


Because this file is a CSV, `pd.read_csv` is the better choice: it uses `sep=','` by default, so the separator does not have to be passed explicitly:

In [6]:
ufo = pd.read_csv('http://bit.ly/uforeports')
ufo.head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |
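
To check that the two readers agree without downloading anything, the sketch below parses a tiny in-memory CSV both ways. The `csv_text` string is a made-up stand-in for the real file, and `io.StringIO` is used only so the example runs offline.

```python
import io

import pandas as pd

# Hypothetical two-row CSV, standing in for the UFO reports file.
csv_text = "City,State\nIthaca,NY\nHolyoke,CO\n"

# read_table defaults to a tab separator, so the comma must be passed explicitly.
via_table = pd.read_table(io.StringIO(csv_text), sep=',')

# read_csv defaults to a comma separator.
via_csv = pd.read_csv(io.StringIO(csv_text))

# Both calls should produce identical DataFrames.
print(via_table.equals(via_csv))
```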


In [7]:
# Let's confirm this is a DataFrame.

type(ufo)

pandas.core.frame.DataFrame

Select a Series with bracket notation, passing the column name as a string:
    - The column name is case-sensitive!

In [8]:
ufo['City']

0                      Ithaca
1                 Willingboro
2                     Holyoke
3                     Abilene
4        New York Worlds Fair
                 ...         
18236              Grant Park
18237             Spirit Lake
18238             Eagle River
18239             Eagle River
18240                    Ybor
Name: City, Length: 18241, dtype: object
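
As a rough illustration of the case sensitivity, the sketch below uses a small made-up DataFrame (`df`) rather than the UFO data. Asking for `'city'` when the column is named `'City'` raises a `KeyError`. The list comprehension at the end is just one possible way to look up the actual spelling.

```python
import pandas as pd

# Hypothetical DataFrame with a capitalized column name.
df = pd.DataFrame({'City': ['Ithaca', 'Holyoke']})

try:
    df['city']  # wrong case: there is no column with this exact name
except KeyError as err:
    print('KeyError:', err)

# One way to find the real column name regardless of case.
matches = [col for col in df.columns if col.lower() == 'city']
print(matches)
```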

In [13]:
# Let's confirm it is a Series.

type(ufo['State'])

pandas.core.series.Series

In [14]:
# Dot notation is a shortcut for the bracket notation (less typing).

ufo.City

0                      Ithaca
1                 Willingboro
2                     Holyoke
3                     Abilene
4        New York Worlds Fair
                 ...         
18236              Grant Park
18237             Spirit Lake
18238             Eagle River
18239             Eagle River
18240                    Ybor
Name: City, Length: 18241, dtype: object

This is possible because, every time a Series is added to a DataFrame, pandas automatically makes its name available as an attribute of that DataFrame.

**Dot notation cannot be used if the column name contains spaces!**

**The same applies if the column name conflicts with a built-in DataFrame method or attribute (for example `shape` or `head`). In both cases, use bracket notation.**

In [15]:
ufo['Colors Reported']

0        NaN
1        NaN
2        NaN
3        NaN
4        NaN
        ... 
18236    NaN
18237    NaN
18238    NaN
18239    RED
18240    NaN
Name: Colors Reported, Length: 18241, dtype: object
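
Both limitations can be seen in a short sketch. The DataFrame below is invented for illustration: it has one column whose name contains a space and one column, `shape`, whose name collides with the built-in `DataFrame.shape` attribute.

```python
import pandas as pd

# Hypothetical data: one name with a space, one name that shadows an attribute.
df = pd.DataFrame({
    'Colors Reported': ['RED', None],
    'shape': ['DISK', 'OVAL'],
})

# `df.Colors Reported` would be a SyntaxError, so brackets are the only option.
colors = df['Colors Reported']

# Dot notation returns the built-in attribute (rows, columns), not the column.
print(df.shape)  # (2, 2)

# Bracket notation returns the actual column.
print(df['shape'])
```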

### BONUS TIP! 

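How do we create a new pandas Series in a DataFrame?

   - **Bracket notation is required on the left-hand side of the assignment!** Dot notation is fine on the right-hand side, where existing columns are only being read.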

In [18]:
ufo['Location'] = ufo.City + ', ' + ufo.State

In [20]:
ufo.head()

|   | City | Colors Reported | Shape Reported | State | Time | Location |
|---|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 | Ithaca, NY |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 | Willingboro, NJ |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 | Holyoke, CO |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 | Abilene, KS |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 | New York Worlds Fair, NY |
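
The sketch below shows why brackets are needed when creating a column, using a small made-up DataFrame instead of the UFO data. Assigning through dot notation sets a plain Python attribute on the object and does not add a column (pandas emits a warning about this, which is silenced here to keep the output clean). The `attempt` copy is only there to keep the failed attempt separate from the working one.

```python
import warnings

import pandas as pd

# Hypothetical two-row DataFrame.
df = pd.DataFrame({'City': ['Ithaca', 'Holyoke'], 'State': ['NY', 'CO']})

# Dot-notation assignment on a copy: this does NOT create a column.
attempt = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # hide the pandas warning for this pattern
    attempt.Location = attempt.City + ', ' + attempt.State
print('Location' in attempt.columns)

# Bracket-notation assignment: this creates the new column.
df['Location'] = df.City + ', ' + df.State
print('Location' in df.columns)
```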
