## Note about Pandas DataFrames/Series

A [DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) is a collection of [Series](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html):
the DataFrame is how Pandas represents a table, and a Series is the data structure Pandas uses to represent a column.
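
As a minimal sketch of this relationship, the snippet below builds a small table and pulls out one column. The country index, the `gdpPercap_<year>` column names, and the numbers themselves are assumptions made up for illustration; they are not real GDP figures.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0], [4000.0, 5000.0], [12000.0, 15000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967"],
)

column = data["gdpPercap_1962"]  # selecting a single column gives a Series

print(type(data).__name__)    # DataFrame
print(type(column).__name__)  # Series
# The Series carries the same row index as the table it came from.
print(column.index.equals(data.index))
```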

Pandas is built on top of the [NumPy](http://www.numpy.org/) library, which in practice means that
most of the methods defined for NumPy arrays also apply to Pandas Series and DataFrames.
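
A short sketch of what this looks like in practice, using a hypothetical three-element Series with placeholder labels: NumPy functions can be applied directly, and familiar array attributes and methods are available.

```python
import numpy as np
import pandas as pd

# Hypothetical Series; the labels "a", "b", "c" are placeholders.
series = pd.Series([100.0, 400.0, 900.0], index=["a", "b", "c"])

roots = np.sqrt(series)     # NumPy function applied element by element
values = series.to_numpy()  # the underlying data as a plain NumPy array

print(roots)                # still a Series, with the original labels
print(series.shape, series.mean())
print(type(values).__name__)
```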

What makes Pandas so attractive is its powerful interface for accessing individual records
of a table, its proper handling of missing values, and its support for relational-database
operations between DataFrames.

## Selecting values

To access the value at position `[i,j]` of a DataFrame, we have two options, depending on
whether `i` and `j` refer to positions or to labels.

Remember that a DataFrame provides an *index* as a way to identify the rows of the table;
a row, then, has a *position* inside the table as well as a *label*, which
uniquely identifies its *entry* in the DataFrame.

## Use `DataFrame.iloc[..., ...]` to select values by their (entry) position

We can specify a location by its numerical position, counting from zero, in the same way we would index into a string.
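
A minimal sketch of positional selection, assuming a small country-indexed table. The column names and values are made up for illustration and are not real GDP figures.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

first = data.iloc[0, 0]  # row 0, column 0: Italy, 1962
other = data.iloc[2, 1]  # row 2, column 1: Netherlands, 1967

print(first, other)
```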

## Use `DataFrame.loc[..., ...]` to select values by their (entry) label.

We can specify location by row and column name, too. Sometimes, this is more intuitive when you are working with data.
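
The same kind of lookup can be sketched with labels instead of positions. As before, the table layout, column names, and values are illustrative assumptions rather than real data.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

# Row label first, then column label.
by_label = data.loc["Netherlands", "gdpPercap_1967"]
# The same cell, addressed by position for comparison.
by_position = data.iloc[2, 1]

print(by_label, by_position)
```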

## Practice with Selection of Individual Values

#### Write an expression to find the Per Capita GDP of Serbia in 2007.

## Use `:` on its own to mean all columns or all rows.

This is just the same way you would slice data in NumPy (as we did earlier today) or even in Python lists.

In Pandas, when we want to select all of the data in a single row, we don't need to add `:` to say that we want all columns.

When we want to select all of the data in a column, though, we *do* in fact need the `:` if we use `.loc`.

Interestingly, there are actually a couple of different ways that we can access the same column in a Pandas DataFrame.
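
The equivalences described above can be checked with a small sketch. The table, its column names, and its values are assumptions made up for illustration.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

# A whole row: the trailing ":" is optional.
row_explicit = data.loc["Italy", :]
row_short = data.loc["Italy"]

# A whole column: with .loc the leading ":" is required...
col_loc = data.loc[:, "gdpPercap_1962"]
# ...but the column can also be reached by key or by attribute.
col_key = data["gdpPercap_1962"]
col_attr = data.gdpPercap_1962

print(row_explicit.equals(row_short))
print(col_loc.equals(col_key) and col_key.equals(col_attr))
```

Attribute access only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute, so the key-based forms are the safer default.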

## Practicing with Slicing

#### What if we only care about the data from 1962 to 1972 from Italy, Montenegro, Netherlands, Norway, and Poland?
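
One possible sketch of such a selection uses label slices with `.loc`. The table here is synthetic: the extra countries and years, the `gdpPercap_<year>` column names, and the counter-style values are all assumptions chosen only to show which rows and columns the slice keeps.

```python
import numpy as np
import pandas as pd

# Synthetic table: values are just a running counter, not GDP figures.
countries = ["Ireland", "Italy", "Montenegro", "Netherlands",
             "Norway", "Poland", "Portugal"]
columns = [f"gdpPercap_{year}" for year in (1957, 1962, 1967, 1972, 1977)]
values = np.arange(len(countries) * len(columns), dtype=float)
data = pd.DataFrame(
    values.reshape(len(countries), len(columns)),
    index=pd.Index(countries, name="country"),
    columns=columns,
)

# Unlike positional slices, label slices with .loc include BOTH endpoints.
subset = data.loc["Italy":"Poland", "gdpPercap_1962":"gdpPercap_1972"]

print(subset)
```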

## Results of slicing can be used in further operations.

*   Usually, we don't just print a slice.
*   All the statistical operators that work on entire dataframes
    work the same way on slices.
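
For example, a slice can be fed straight into a summary method. This sketch assumes a toy table with made-up values and assumed column names.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

subset = data.loc["Italy":"Montenegro", "gdpPercap_1962":"gdpPercap_1967"]

# Summary methods work column by column on the slice,
# exactly as they would on the full table.
slice_max = subset.max()
slice_min = subset.min()

print(slice_max)
print(slice_min)
```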

## Use comparisons to select data based on value.

*   Comparison is applied element by element.
*   Returns a DataFrame of the same shape containing `True` and `False`.

#### Let's take a look at places in our data slice from earlier and find where the GDP per capita was greater than $10,000.
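
A minimal sketch of such a comparison, with a toy table standing in for the slice. The values and column names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the slice: made-up values, assumed column names.
subset = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

# The comparison is evaluated for every cell; the result has the same
# index and columns as `subset`, but holds booleans.
mask = subset > 10000

print(mask)
```

Note that the comparison is strict, so a cell holding exactly 10000 comes out as `False`.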

## Select values or NaN using a Boolean mask.

A frame full of Booleans is sometimes called a *mask* because of how it can be used.

When we use a mask to index a DataFrame, we get the value where the mask is true and NaN (Not a Number) where it is false. This is useful because NaN values are ignored when we compute statistics on our data.
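
The sketch below applies a mask to a toy table and then computes statistics on the result. The values and column names are assumptions made up for illustration.

```python
import pandas as pd

# Toy stand-in for the slice: made-up values, assumed column names.
subset = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0]],
    index=pd.Index(["Italy", "Montenegro", "Netherlands"], name="country"),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

mask = subset > 10000
masked = subset[mask]  # values where the mask is True, NaN elsewhere

print(masked)
# count() and mean() skip the NaN cells by default.
print(masked.count())
print(masked.mean())
```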

## Select-Apply-Combine operations

Pandas' vectorized methods and grouping operations give users
a great deal of flexibility in analysing their data.

For instance, let's say we want a clearer view of how the European countries
split themselves according to their GDP.

1.  We can get a first look by splitting the countries into two groups for each year surveyed:
    those with a GDP *higher* than the European average and those with a *lower* GDP.
2.  We then estimate a *wealthy score* based on the historical values (from 1962 to 2007),
    counting how many times a country has appeared in the *higher* or *lower* GDP group.
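
The two steps above can be sketched as follows. This is one possible reading, not necessarily the lesson's own code: it assumes the European average is the per-year mean over all countries in the table, and it expresses the score as the fraction of years spent in the *higher* group. The table, its column names, and its values are made up for illustration.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0],
     [13000.0, 16000.0, 19000.0],
     [5000.0, 6000.0, 9000.0]],
    index=pd.Index(
        ["Italy", "Montenegro", "Netherlands", "Norway", "Poland"],
        name="country",
    ),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

# Step 1: for each year, flag the countries above that year's average.
mask_higher = data > data.mean()

# Step 2: count the years each country was in the "higher" group, then
# normalise by the number of years (the normalisation is an assumption).
times_higher = mask_higher.sum(axis=1)
wealth_scores = times_higher / len(data.columns)

print(wealth_scores)
```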

Finally, for each group in the `wealth_scores` table, we can sum the (financial) contribution
of its members across the years surveyed.
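
One way to sketch this final step is with `DataFrame.groupby`, which the text does not name explicitly, so treat it as an assumption. The scores are recomputed here so the example runs on its own; the table, its column names, and its values are made up for illustration.

```python
import pandas as pd

# Toy table: made-up values and assumed column names (not real GDP data).
data = pd.DataFrame(
    [[9000.0, 10000.0, 12000.0],
     [4000.0, 5000.0, 7000.0],
     [12000.0, 15000.0, 18000.0],
     [13000.0, 16000.0, 19000.0],
     [5000.0, 6000.0, 9000.0]],
    index=pd.Index(
        ["Italy", "Montenegro", "Netherlands", "Norway", "Poland"],
        name="country",
    ),
    columns=["gdpPercap_1962", "gdpPercap_1967", "gdpPercap_1972"],
)

# Fraction of years each country spent above the per-year average.
wealth_scores = (data > data.mean()).sum(axis=1) / len(data.columns)

# Countries sharing a score form one group; sum() adds up their rows
# separately for each year.
totals = data.groupby(wealth_scores).sum()

print(totals)
```

With this toy data, Montenegro and Poland share the lowest score and Netherlands and Norway share the highest, so the result has one row per distinct score rather than one row per country.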