In [1]:
import pandas as pd

### Pandas Unique

Pandas .unique() returns the distinct values in a Series, such as a single DataFrame column. This is very useful when you're trying to understand the cardinality (*how many* distinct elements) of a column or group.

Let's run through an example:
1. Find the unique values within a Pandas column

And one application:
1. Iterate through the unique values of a DataFrame column, then filter the DataFrame and do something with each subset

Let's first create a DataFrame.

In [6]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'Bar', 80.5),
                   ('The Square', 'Bar', 19.34),
                   ('The Square', 'Bar', 29.30),
                   ('Foreign Cinema', 'Restaurant', 340.03),
                   ('500 Club', 'Bar', 50.7),
                   ('500 Club', 'Bar', 45.2)],
                  columns=('name', 'type', 'AvgBill'))
df

|   | name | type | AvgBill |
|---|------|------|---------|
| 0 | Foreign Cinema | Restaurant | 289.0 |
| 1 | Liho Liho | Restaurant | 224.0 |
| 2 | 500 Club | Bar | 80.5 |
| 3 | The Square | Bar | 19.34 |
| 4 | The Square | Bar | 29.3 |
| 5 | Foreign Cinema | Restaurant | 340.03 |
| 6 | 500 Club | Bar | 50.7 |
| 7 | 500 Club | Bar | 45.2 |


### 1. Find the unique values within a Pandas column

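Say you want to find the unique values within a Pandas column. All you need to do is call .unique() on the column you're interested in. The result is a NumPy array holding each distinct value once, in order of first appearance (it is not sorted). Let's first find the unique values within 'name', then within 'type'.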

In [4]:
df['name'].unique()

array(['Foreign Cinema', 'Liho Liho', '500 Club', 'The Square'],
      dtype=object)

In [5]:
df['type'].unique()

array(['Restaurant', 'Bar'], dtype=object)
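
If all you need is the cardinality itself rather than the values, counting the array works, but pandas also has .nunique() and .value_counts() for this. The cell below is a minimal sketch that rebuilds just the 'name' and 'type' columns from the DataFrame above so it can run on its own; the variable names n_names, n_types, and name_counts are only illustrative.

```python
import pandas as pd

# Rebuild the two text columns from the example DataFrame so this cell is self-contained
df = pd.DataFrame({
    'name': ['Foreign Cinema', 'Liho Liho', '500 Club', 'The Square',
             'The Square', 'Foreign Cinema', '500 Club', '500 Club'],
    'type': ['Restaurant', 'Restaurant', 'Bar', 'Bar',
             'Bar', 'Restaurant', 'Bar', 'Bar'],
})

# Number of distinct values in each column (same as len(df[col].unique()) here)
n_names = df['name'].nunique()
n_types = df['type'].nunique()

# How many rows each distinct name accounts for
name_counts = df['name'].value_counts()

print(n_names, n_types)
print(name_counts)
```

One difference to keep in mind: .unique() includes a missing value (NaN) in its result if one is present, while .nunique() ignores missing values by default, so the two counts can differ on columns with gaps.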

### Application 1. Iterate through the unique values of a DataFrame column, then filter the DataFrame and do something with your data

I often use .unique() when I want to iterate through the subsets of my DataFrame. Here I'm going to iterate through the unique values within the 'name' column and, for each one, filter the DataFrame down to the matching rows and sum their 'AvgBill' values.

There are more efficient ways of doing this (a groupby, for example), but we'll use the loop as the demonstration.

In [7]:
unique_values = df['name'].unique()
unique_values

array(['Foreign Cinema', 'Liho Liho', '500 Club', 'The Square'],
      dtype=object)

In [9]:
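for rezy in unique_values:
    # Keep only the rows whose 'name' matches the current value
    df_single_rezy = df[df['name'] == rezy]

    # Sum the bills for that subset and report the total
    print("Your total bill for {} is {}".format(rezy, df_single_rezy['AvgBill'].sum()))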

Your total bill for Foreign Cinema is 629.03
Your total bill for Liho Liho is 224.0
Your total bill for 500 Club is 176.39999999999998
Your total bill for The Square is 48.64
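
For comparison, here is one way the same totals could be computed without an explicit filter per value, using groupby. This is a sketch rather than the only option: it recreates the example DataFrame so it runs standalone, the name totals is just illustrative, and the two-decimal formatting is an added choice to hide floating-point noise such as 176.39999999999998.

```python
import pandas as pd

# Same data as the example DataFrame above
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'Bar', 80.5),
                   ('The Square', 'Bar', 19.34),
                   ('The Square', 'Bar', 29.30),
                   ('Foreign Cinema', 'Restaurant', 340.03),
                   ('500 Club', 'Bar', 50.7),
                   ('500 Club', 'Bar', 45.2)],
                  columns=['name', 'type', 'AvgBill'])

# Group rows by name and sum AvgBill within each group.
# sort=False keeps the groups in order of first appearance, matching .unique()
totals = df.groupby('name', sort=False)['AvgBill'].sum()

# totals is a Series indexed by name; iterate over (name, total) pairs
for name, total in totals.items():
    print("Your total bill for {} is {:.2f}".format(name, total))
```

The loop over .unique() is still handy when the "do something" step is more involved than a single aggregation, for example writing each subset to its own file.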
