# Selecting and Slicing Data

- Selecting by [rows](#Selecting-Rows-By-Number) and [columns](#Selecting-Columns)
- Selecting by [criteria](#Selecting-Rows-By-Criteria)
- [Frequency counts](#Frequency-Counts)
- [Top 10](#Top-10-Most-Frequent-Values)
- Pivot tables
- [Turning columns into rows](#Turning-Columns-into-Rows-Using-Melt) using melt
- Other ways of aggregating data

## Selecting Columns

There are several ways to select columns; we'll cover the two most common. You can put the name of the column in brackets and single quotes:

In [None]:
members['Member Status']

or you can use dot notation:

In [None]:
members.Gender

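The advantage of dot notation is that it supports tab completion. It only works when the column name is a valid Python identifier, so a name that contains a space, such as 'Member Status', has to be selected with brackets.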

To select several columns, give Pandas a list of the columns you want in brackets:

In [None]:
members[['Member Status','Gender']]
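As a self-contained sketch of the three forms above, the following builds a small stand-in for `members` and selects columns from it. The values and the extra `Age` column are invented for illustration and are not the data used in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for the members data; all values are invented.
members = pd.DataFrame({
    'Member Status': ['Active', 'Lapsed', 'Active'],
    'Gender': ['F', 'M', 'F'],
    'Age': [34, 51, 29],
})

status = members['Member Status']              # bracket notation returns a Series
gender = members.Gender                        # dot notation returns the same kind of Series
subset = members[['Member Status', 'Gender']]  # a list of names returns a DataFrame

print(type(status).__name__)
print(type(subset).__name__)
print(list(subset.columns))
```

A single name gives back a Series, while a list of names (even a list of one) gives back a DataFrame.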

You can also select columns and rows at the same time.

## Selecting Rows By Number

To select rows by position, slice the DataFrame. For example, if you want the first 10 rows:

In [None]:
pastries[:10]

You can also select rows and columns at the same time. If you want the member status and gender for the first 10 rows:

In [None]:
members[['Member Status','Gender']][:10]
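Chaining a column selection and a row slice works well for reading data. pandas also provides the `.iloc` (position-based) and `.loc` (label-based) indexers, which select rows and columns in a single step. A minimal sketch, assuming an invented `members` DataFrame with the default integer index:

```python
import pandas as pd

# Hypothetical data for illustration only.
members = pd.DataFrame({
    'Member Status': ['Active', 'Lapsed', 'Active', 'Active'],
    'Gender': ['F', 'M', 'F', 'M'],
    'Age': [34, 51, 29, 42],
})

# Chained form from the text: pick columns, then slice the first 2 rows.
chained = members[['Member Status', 'Gender']][:2]

# Position-based: first 2 rows and the first 2 columns.
by_position = members.iloc[:2, [0, 1]]

# Label-based: .loc slices include the end label, so 0:1 means rows 0 and 1.
by_label = members.loc[0:1, ['Member Status', 'Gender']]

print(by_label)
```

All three expressions return the same two rows here. The single-step indexers are generally the safer choice when you later want to assign values to the selection.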

## Selecting Rows By Criteria

Pandas also allows you to select rows using criteria. For example, to get every row where a family owns their home:

In [None]:
housing[ housing['Type'] == "Owner"]

To get every purchase of less than $10:

In [None]:
purchases[purchases['Price'] < 10]

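You can also use & for AND and | for OR. Wrap each condition in parentheses, because & and | bind more tightly than comparison operators such as == and >. For example, to get every family who owns their home and has more than three members:

In [None]:
housing[(housing['Type'] == "Owner") & (housing['Household Members'] > 3)]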
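A runnable sketch of combining conditions, using a made-up `housing` DataFrame (the rows are invented; only the column names come from the examples above). Naming each condition first keeps the filter readable, and the inline form shows the parentheses that Python's operator precedence requires around each comparison.

```python
import pandas as pd

# Hypothetical data; column names follow the examples in the text.
housing = pd.DataFrame({
    'Type': ['Owner', 'Renter', 'Owner', 'Owner'],
    'Household Members': [2, 5, 4, 6],
})

# Each comparison produces a boolean Series, one True/False per row.
is_owner = housing['Type'] == 'Owner'
is_large = housing['Household Members'] > 3

owners_and_large = housing[is_owner & is_large]  # AND: both conditions hold
owners_or_large = housing[is_owner | is_large]   # OR: at least one holds
not_owners = housing[~is_owner]                  # NOT: invert the condition

# The same AND filter written inline; note the parentheses around each comparison.
inline = housing[(housing['Type'] == 'Owner') & (housing['Household Members'] > 3)]
print(inline)
```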

## Frequency Counts

Use value_counts() to get a breakdown by a variable. For example, to find out how many of each type of pastry was purchased, from highest to lowest number purchased:

In [None]:
pastries['Type'].value_counts()
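To make the behaviour concrete, here is a small sketch with an invented `pastries` DataFrame (the pastry names are placeholders). It also shows the `normalize=True` option, which returns proportions instead of raw counts.

```python
import pandas as pd

# Hypothetical purchases; the pastry names are invented.
pastries = pd.DataFrame({
    'Type': ['Croissant', 'Scone', 'Croissant', 'Muffin', 'Croissant', 'Scone'],
})

# Counts per pastry type, sorted from most to least frequent.
counts = pastries['Type'].value_counts()

# Share of all purchases per pastry type (values sum to 1).
shares = pastries['Type'].value_counts(normalize=True)

print(counts)
print(shares)
```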

## Top 10 Most Frequent Values

To get the top 10, top 20, etc., use value_counts. Since value_counts returns the frequency counts in order from largest to smallest, all you need to do is grab the number of rows you want. For example, to get the 10 most frequently purchased types of pastry:

In [None]:
pastries['Type'].value_counts()[:10]
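Slicing is one way to take the top of the list; `head(n)` does the same thing and may read more clearly. A sketch on invented data, taking the top 2 rather than the top 10 to keep the example small:

```python
import pandas as pd

# Hypothetical purchases; the pastry names are invented.
pastries = pd.DataFrame({
    'Type': ['Croissant', 'Scone', 'Croissant', 'Muffin',
             'Croissant', 'Scone', 'Eclair', 'Croissant', 'Scone'],
})

counts = pastries['Type'].value_counts()

top_sliced = counts[:2]    # slice form used in the text
top_head = counts.head(2)  # equivalent, using head()

print(top_head)
```

If fewer distinct values exist than the number requested, both forms simply return everything available.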

If you want to plot the top 5 on a horizontal bar chart using matplotlib:

In [None]:
pastries['Type'].value_counts()[:5].plot(kind='barh', title='Top 5 Pastries')

Because this is a horizontal bar chart, pandas automatically uses the values of "Type" as the bar labels on the y axis and the frequency counts as the bar lengths along the x axis.

## Turning Columns into Rows Using Melt

Suppose you have data on meat purchases over time (one of the sample DataFrames in ggplot):

In [1]:
from ggplot import *
meat

To convert some of the columns of meat purchases into rows:

In [3]:
import pandas as pd

# Keep 'date' as an identifier column; stack the three meat columns into rows.
meat_long = pd.melt(meat[['date', 'beef', 'pork', 'broilers']], id_vars='date', var_name='meat_type')
meat_long

|   | date | meat_type | value |
|---|------------|------|-----|
| 0 | 1944-01-01 | beef | 751 |
| 1 | 1944-02-01 | beef | 713 |
| 2 | 1944-03-01 | beef | 741 |
| 3 | 1944-04-01 | beef | 650 |
| 4 | 1944-05-01 | beef | 681 |
| 5 | 1944-06-01 | beef | 658 |
| 6 | 1944-07-01 | beef | 662 |
| 7 | 1944-08-01 | beef | 787 |
| 8 | 1944-09-01 | beef | 774 |
| 9 | 1944-10-01 | beef | 834 |
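Since the `meat` sample data depends on the ggplot package, here is a self-contained sketch of the same reshape on a tiny invented DataFrame (the dates and numbers are made up). It also shows `pivot`, which goes the other way and turns the rows back into columns.

```python
import pandas as pd

# Hypothetical wide-format data shaped like the meat example; numbers are invented.
wide = pd.DataFrame({
    'date': ['2000-01-01', '2000-02-01'],
    'beef': [10, 12],
    'pork': [20, 22],
})

# 'date' stays as a column; every other column becomes (meat_type, value) rows.
long = pd.melt(wide, id_vars='date', var_name='meat_type', value_name='value')
print(long)

# pivot reverses the reshape, giving one column per meat type again.
back = long.pivot(index='date', columns='meat_type', values='value')
print(back)
```

Each original row produces one output row per melted column, so two dates and two meat columns give four rows in `long`.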
