# 3 Ways of Filtering a Pandas DataFrame

### How do you chain multiple filters to manipulate a DataFrame? Check out this article about [3 ways to filter a Pandas DataFrame](https://medium.com/swlh/3-ways-to-filter-pandas-dataframe-by-column-values-dfb6609b31de#37b9)

Standard imports

In [1]:
import pandas as pd

Sample Sales Dataset obtained from [Kaggle](https://www.kaggle.com/kyanyoga/sample-sales-data). The `encoding` parameter is included to avoid a `UnicodeDecodeError` when reading the file.

In [2]:
sales_data = pd.read_csv('Dataset/sales_data.csv', encoding = "ISO-8859-1")
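
To see why the `encoding` argument matters, here is a minimal sketch that reproduces the decode error on a tiny made-up file rather than on the Kaggle dataset. The file contents and the name `sample_path` are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Write a tiny CSV encoded as ISO-8859-1. The byte for "é" (0xE9) is not
# valid UTF-8 when it is followed by a plain ASCII letter.
raw = "CITY,CONTACTFIRSTNAME\nParis,Frédérique\n".encode("ISO-8859-1")
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    sample_path = tmp.name  # hypothetical file, created just for this demo

# read_csv decodes as UTF-8 unless told otherwise, so this attempt fails.
try:
    pd.read_csv(sample_path)
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Passing the encoding the file was actually written in decodes it cleanly.
sample = pd.read_csv(sample_path, encoding="ISO-8859-1")
os.remove(sample_path)
```

If the encoding of a file is unknown, trying UTF-8 first and falling back to ISO-8859-1 is a common approach, though the fallback can silently mis-decode files written in some other encoding.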

A view of the dataset

In [3]:
sales_data.head()

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,ADDRESSLINE1,ADDRESSLINE2,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
0,10107,30,95.7,2,2871.0,2/24/2003 0:00,Shipped,1,2,2003,...,897 Long Airport Avenue,,NYC,NY,10022.0,USA,,Yu,Kwai,Small
1,10121,34,81.35,5,2765.9,5/7/2003 0:00,Shipped,2,5,2003,...,59 rue de l'Abbaye,,Reims,,51100.0,France,EMEA,Henriot,Paul,Small
2,10134,41,94.74,2,3884.34,7/1/2003 0:00,Shipped,3,7,2003,...,27 rue du Colonel Pierre Avia,,Paris,,75508.0,France,EMEA,Da Cunha,Daniel,Medium
3,10145,45,83.26,6,3746.7,8/25/2003 0:00,Shipped,3,8,2003,...,78934 Hillside Dr.,,Pasadena,CA,90003.0,USA,,Young,Julie,Medium
4,10159,49,100.0,14,5205.27,10/10/2003 0:00,Shipped,4,10,2003,...,7734 Strong St.,,San Francisco,CA,,USA,,Brown,Julie,Medium


## Exploratory Data Analysis

In [4]:
sales_data.shape

(2823, 25)

Take a look at the columns in the dataset

In [5]:
sales_data.columns

Index(['ORDERNUMBER', 'QUANTITYORDERED', 'PRICEEACH', 'ORDERLINENUMBER',
       'SALES', 'ORDERDATE', 'STATUS', 'QTR_ID', 'MONTH_ID', 'YEAR_ID',
       'PRODUCTLINE', 'MSRP', 'PRODUCTCODE', 'CUSTOMERNAME', 'PHONE',
       'ADDRESSLINE1', 'ADDRESSLINE2', 'CITY', 'STATE', 'POSTALCODE',
       'COUNTRY', 'TERRITORY', 'CONTACTLASTNAME', 'CONTACTFIRSTNAME',
       'DEALSIZE'],
      dtype='object')

What are the deal sizes?

In [6]:
sales_data['DEALSIZE'].value_counts()

Medium    1384
Small     1282
Large      157
Name: DEALSIZE, dtype: int64

How many kinds of status can an order have?

In [7]:
sales_data['STATUS'].value_counts()

Shipped       2617
Cancelled       60
Resolved        47
On Hold         44
In Process      41
Disputed        14
Name: STATUS, dtype: int64

In which countries are the sales made? How many orders are there for each country?

In [8]:
sales_data['COUNTRY'].value_counts()

USA            1004
Spain           342
France          314
Australia       185
UK              144
Italy           113
Finland          92
Norway           85
Singapore        79
Canada           70
Denmark          63
Germany          62
Sweden           57
Austria          55
Japan            52
Belgium          33
Switzerland      31
Philippines      26
Ireland          16
Name: COUNTRY, dtype: int64

## Data Preprocessing

Checking the dataset for null values so they can be removed

In [9]:
sales_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2823 entries, 0 to 2822
Data columns (total 25 columns):
ORDERNUMBER         2823 non-null int64
QUANTITYORDERED     2823 non-null int64
PRICEEACH           2823 non-null float64
ORDERLINENUMBER     2823 non-null int64
SALES               2823 non-null float64
ORDERDATE           2823 non-null object
STATUS              2823 non-null object
QTR_ID              2823 non-null int64
MONTH_ID            2823 non-null int64
YEAR_ID             2823 non-null int64
PRODUCTLINE         2823 non-null object
MSRP                2823 non-null int64
PRODUCTCODE         2823 non-null object
CUSTOMERNAME        2823 non-null object
PHONE               2823 non-null object
ADDRESSLINE1        2823 non-null object
ADDRESSLINE2        302 non-null object
CITY                2823 non-null object
STATE               1337 non-null object
POSTALCODE          2747 non-null object
COUNTRY             2823 non-null object
TERRITORY           1749 non-null object

The 'ADDRESSLINE2' column is mostly null (only 302 of its 2823 values are present) and is not needed for this project, so it is dropped.

In [10]:
sales_data.drop(['ADDRESSLINE2'], axis=1, inplace=True)

In [11]:
sales_data.isnull().sum()

ORDERNUMBER            0
QUANTITYORDERED        0
PRICEEACH              0
ORDERLINENUMBER        0
SALES                  0
ORDERDATE              0
STATUS                 0
QTR_ID                 0
MONTH_ID               0
YEAR_ID                0
PRODUCTLINE            0
MSRP                   0
PRODUCTCODE            0
CUSTOMERNAME           0
PHONE                  0
ADDRESSLINE1           0
CITY                   0
STATE               1486
POSTALCODE            76
COUNTRY                0
TERRITORY           1074
CONTACTLASTNAME        0
CONTACTFIRSTNAME       0
DEALSIZE               0
dtype: int64

Replacing all remaining null values with 'Unknown', since the columns that contain them ('STATE', 'POSTALCODE' and 'TERRITORY') all hold strings

In [12]:
sales_data.fillna('Unknown', inplace=True)

Double-checking whether any null values persist

In [13]:
sales_data.isnull().values.any()

False
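
The blanket `fillna('Unknown')` is safe here only because every column with gaps holds text. As a hedged sketch of a more selective variant, passing a dict to `fillna` limits the replacement to the named columns. The toy frame below is made up for illustration and is not part of the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame with a gap in a text column and a gap in a numeric column.
toy = pd.DataFrame({
    "STATE": ["NY", None, "CA"],
    "SALES": [100.0, np.nan, 300.0],
})

# A dict maps column name -> fill value; columns not listed are left alone.
filled = toy.fillna({"STATE": "Unknown"})

state_values = filled["STATE"].tolist()
remaining_sales_gaps = int(filled["SALES"].isna().sum())
```

This keeps the placeholder string out of numeric columns, where it would otherwise mix text into the numbers.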

## 1) Filtering based on one condition:

There is a *DEALSIZE* column in this dataset whose value is either Small, Medium or Large. Let’s say we want to know the details of all the large deals. A simple way would be:

In [14]:
largedeals = sales_data[sales_data['DEALSIZE'] == 'Large']
largedeals

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,PHONE,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
20,10341,41,100.00,9,7737.93,11/24/2004 0:00,Shipped,4,11,2004,...,6562-9555,Geislweg 14,Salzburg,Unknown,5020,Austria,EMEA,Pipps,Georg,Large
25,10417,66,100.00,2,7516.08,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Large
27,10112,29,100.00,1,7209.11,3/24/2003 0:00,Shipped,1,3,2003,...,0921-12 3555,Berguvsvgen 8,Lule,Unknown,S-958 22,Sweden,EMEA,Berglund,Christina,Large
28,10126,38,100.00,11,7329.06,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Large
29,10140,37,100.00,11,7374.10,7/24/2003 0:00,Shipped,3,7,2003,...,6505556809,9408 Furth Circle,Burlingame,CA,94217,USA,Unknown,Hirano,Juri,Large
30,10150,45,100.00,8,10993.50,9/19/2003 0:00,Shipped,3,9,2003,...,+65 221 7555,"Bronz Sok., Bronz Apt. 3/6 Tesvikiye",Singapore,Unknown,79903,Singapore,Japan,Natividad,Eric,Large
32,10174,34,100.00,4,8014.82,11/6/2003 0:00,Shipped,4,11,2003,...,61-7-3844-6555,31 Duncan St. West End,South Brisbane,Queensland,4101,Australia,APAC,Calaghan,Tony,Large
34,10194,42,100.00,11,7290.36,11/25/2003 0:00,Shipped,4,11,2003,...,78.32.5555,"2, rue du Commerce",Lyon,Unknown,69004,France,EMEA,Saveley,Mary,Large
35,10206,47,100.00,6,9064.89,12/5/2003 0:00,Shipped,4,12,2003,...,(604) 555-3392,1900 Oak St.,Vancouver,BC,V3F 2K1,Canada,Unknown,Tannamuri,Yoshi,Large
39,10258,32,100.00,6,7680.64,6/15/2004 0:00,Shipped,2,6,2004,...,+81 3 3584 0555,2-2-8 Roppongi,Minato-ku,Tokyo,106-0032,Japan,Japan,Shimamura,Akiko,Large
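
The same single-condition filter can be written in a couple of other ways. The sketch below uses a small made-up frame (not the Kaggle data) to compare the boolean mask with `DataFrame.query` and `Series.isin`.

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "ORDERNUMBER": [1, 2, 3, 4],
    "DEALSIZE": ["Small", "Large", "Medium", "Large"],
})

# Boolean mask, as in the cell above.
by_mask = toy[toy["DEALSIZE"] == "Large"]

# The same filter expressed as a query string.
by_query = toy.query("DEALSIZE == 'Large'")

# isin() covers "one of several values" without chaining == comparisons.
not_small = toy[toy["DEALSIZE"].isin(["Medium", "Large"])]
```

All three keep the original row index, which is why the filtered output above starts at index 20 rather than 0.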


Say I find out that Madrid is the top-ranking city in terms of revenue, and I’d like to compare the sales details of Madrid against all the other cities. This can be achieved by assigning the condition to a variable and reusing it, together with its negation (`~`).

In [15]:
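# True for rows whose city is Madrid
madridSales = sales_data['CITY'] == 'Madrid'

# The mask selects the Madrid rows; its negation (~) selects everything else
withMadridSales, withoutMadridSales = sales_data[madridSales], sales_data[~madridSales]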

In [16]:
withMadridSales

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,PHONE,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
25,10417,66,100.00,2,7516.08,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Large
28,10126,38,100.00,11,7329.06,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Large
53,10424,50,100.00,6,12001.00,5/31/2005 0:00,In Process,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Large
79,10417,45,100.00,5,5887.35,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
105,10417,56,100.00,4,9218.16,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Large
126,10350,26,75.47,5,1962.22,12/2/2004 0:00,Shipped,4,12,2004,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
135,10126,22,100.00,4,3347.74,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Medium
169,10203,20,100.00,8,3930.40,12/2/2003 0:00,Shipped,4,12,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
190,10153,20,100.00,11,4904.00,9/28/2003 0:00,Shipped,3,9,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
197,10231,42,100.00,2,8378.58,3/19/2004 0:00,Shipped,1,3,2004,...,+34 913 728 555,"Merchants House, 27-30 Merchant's Quay",Madrid,Unknown,28023,Spain,EMEA,Fernandez,Jesus,Large


In [17]:
withoutMadridSales

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,PHONE,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
0,10107,30,95.70,2,2871.00,2/24/2003 0:00,Shipped,1,2,2003,...,2125557818,897 Long Airport Avenue,NYC,NY,10022,USA,Unknown,Yu,Kwai,Small
1,10121,34,81.35,5,2765.90,5/7/2003 0:00,Shipped,2,5,2003,...,26.47.1555,59 rue de l'Abbaye,Reims,Unknown,51100,France,EMEA,Henriot,Paul,Small
2,10134,41,94.74,2,3884.34,7/1/2003 0:00,Shipped,3,7,2003,...,+33 1 46 62 7555,27 rue du Colonel Pierre Avia,Paris,Unknown,75508,France,EMEA,Da Cunha,Daniel,Medium
3,10145,45,83.26,6,3746.70,8/25/2003 0:00,Shipped,3,8,2003,...,6265557265,78934 Hillside Dr.,Pasadena,CA,90003,USA,Unknown,Young,Julie,Medium
4,10159,49,100.00,14,5205.27,10/10/2003 0:00,Shipped,4,10,2003,...,6505551386,7734 Strong St.,San Francisco,CA,Unknown,USA,Unknown,Brown,Julie,Medium
5,10168,36,96.66,1,3479.76,10/28/2003 0:00,Shipped,4,10,2003,...,6505556809,9408 Furth Circle,Burlingame,CA,94217,USA,Unknown,Hirano,Juri,Medium
6,10180,29,86.13,9,2497.77,11/11/2003 0:00,Shipped,4,11,2003,...,20.16.1555,"184, chausse de Tournai",Lille,Unknown,59000,France,EMEA,Rance,Martine,Small
7,10188,48,100.00,1,5512.32,11/18/2003 0:00,Shipped,4,11,2003,...,+47 2267 3215,"Drammen 121, PR 744 Sentrum",Bergen,Unknown,N 5804,Norway,EMEA,Oeztan,Veysel,Medium
8,10201,22,98.57,2,2168.54,12/1/2003 0:00,Shipped,4,12,2003,...,6505555787,5557 North Pendale Street,San Francisco,CA,Unknown,USA,Unknown,Murphy,Julie,Small
9,10211,41,100.00,14,4708.44,1/15/2004 0:00,Shipped,1,1,2004,...,(1) 47.55.6555,"25, rue Lauriston",Paris,Unknown,75016,France,EMEA,Perrier,Dominique,Medium
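
Once the two subsets exist, comparing them is a matter of running the same aggregation on each. A minimal sketch on made-up numbers (the frame and the variable names are assumptions, not values from the dataset):

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "CITY": ["Madrid", "Paris", "Madrid", "NYC"],
    "SALES": [7000.0, 3000.0, 2000.0, 5000.0],
})

# One mask, used twice: as-is and negated.
is_madrid = toy["CITY"] == "Madrid"
madrid, others = toy[is_madrid], toy[~is_madrid]

# The two subsets partition the frame, so their sizes add up to the total.
rows_accounted_for = len(madrid) + len(others)

madrid_total = madrid["SALES"].sum()
others_total = others["SALES"].sum()
```

Because a mask and its negation never overlap, no row is counted twice or dropped.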


## 2) Filtering based on Multiple Conditions:

Let’s find the countries that had orders on hold in the year 2005. Each condition is wrapped in parentheses and the two are combined with `&`.

In [18]:
sales_data.loc[(sales_data['STATUS'] == 'On Hold') & (sales_data['YEAR_ID'] == 2005), 'COUNTRY']

132     USA
550     USA
598     USA
700     USA
802     USA
879     USA
960     USA
1012    USA
1089    USA
1287    USA
1314    USA
1413    USA
1436    USA
1514    USA
1537    USA
1615    USA
1714    USA
1790    USA
1844    USA
1866    USA
1942    USA
1969    USA
2022    USA
2120    USA
2195    USA
2325    USA
2376    USA
2482    USA
2507    USA
2533    USA
2586    USA
2613    USA
2664    USA
2689    USA
2716    USA
2742    USA
2768    USA
2822    USA
Name: COUNTRY, dtype: object
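
The result above has one entry per matching row, so the same country is repeated. A short sketch on made-up data shows the same `.loc` pattern followed by `unique()` to list each country once.

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "STATUS": ["On Hold", "Shipped", "On Hold", "On Hold"],
    "YEAR_ID": [2005, 2005, 2004, 2005],
    "COUNTRY": ["USA", "France", "Spain", "USA"],
})

# Parentheses are required: in Python, & binds more tightly than ==.
mask = (toy["STATUS"] == "On Hold") & (toy["YEAR_ID"] == 2005)

# One entry per matching row, as in the cell above.
countries = toy.loc[mask, "COUNTRY"]

# Each country listed once.
distinct = countries.unique().tolist()
```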

The number of conditions applied to a DataFrame can grow quickly. Let’s consider a use case.
Say I find out that Spain ranks second in total revenue, and I want to see whether there are any orders in Spain where SALES did not exceed 5000 and QUANTITYORDERED is at most 50. This can be done in two ways:

### 1) Either hard-code the list

In [19]:
cond1 = sales_data['COUNTRY'] == 'Spain'
cond2 = sales_data['QUANTITYORDERED'] <= 50
cond3 = sales_data['SALES'] <= 5000
allcond = cond1 & cond2 & cond3

In [20]:
sales_data[allcond]

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,PHONE,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
126,10350,26,75.47,5,1962.22,12/2/2004 0:00,Shipped,4,12,2004,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
135,10126,22,100.00,4,3347.74,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Medium
169,10203,20,100.00,8,3930.40,12/2/2003 0:00,Shipped,4,12,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
190,10153,20,100.00,11,4904.00,9/28/2003 0:00,Shipped,3,9,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
206,10348,48,52.36,8,2513.28,11/1/2004 0:00,Shipped,4,11,2004,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Small
214,10126,21,100.00,8,2439.57,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Small
265,10417,21,100.00,1,3447.78,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
286,10348,47,100.00,4,4801.52,11/1/2004 0:00,Shipped,4,11,2004,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Medium
287,10358,49,55.34,5,2711.66,12/10/2004 0:00,Shipped,4,12,2004,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
299,10203,20,100.00,6,2254.80,12/2/2003 0:00,Shipped,4,12,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small


### 2) Or build a list that is dynamically evaluated based on the criteria

In [21]:
import functools
condList = [cond1, cond2, cond3]
allcond = functools.reduce(lambda x,y: x & y, condList)
sales_data[allcond]

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,PHONE,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE
126,10350,26,75.47,5,1962.22,12/2/2004 0:00,Shipped,4,12,2004,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
135,10126,22,100.00,4,3347.74,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Medium
169,10203,20,100.00,8,3930.40,12/2/2003 0:00,Shipped,4,12,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
190,10153,20,100.00,11,4904.00,9/28/2003 0:00,Shipped,3,9,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
206,10348,48,52.36,8,2513.28,11/1/2004 0:00,Shipped,4,11,2004,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Small
214,10126,21,100.00,8,2439.57,5/28/2003 0:00,Shipped,2,5,2003,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Small
265,10417,21,100.00,1,3447.78,5/13/2005 0:00,Disputed,2,5,2005,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Medium
286,10348,47,100.00,4,4801.52,11/1/2004 0:00,Shipped,4,11,2004,...,(91) 555 22 82,"C/ Araquil, 67",Madrid,Unknown,28023,Spain,EMEA,Sommer,Mart¡n,Medium
287,10358,49,55.34,5,2711.66,12/10/2004 0:00,Shipped,4,12,2004,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
299,10203,20,100.00,6,2254.80,12/2/2003 0:00,Shipped,4,12,2003,...,(91) 555 94 44,"C/ Moralzarzal, 86",Madrid,Unknown,28034,Spain,EMEA,Freyre,Diego,Small
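
To make the list genuinely dynamic, the conditions themselves can be generated from data. The sketch below is one possible way to do that, assuming a hypothetical `criteria` list of `(column, comparison, value)` tuples, and it uses `operator.and_` in place of the lambda. The toy frame is made up for illustration.

```python
import functools
import operator

import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "COUNTRY": ["Spain", "Spain", "USA", "Spain"],
    "QUANTITYORDERED": [20, 60, 30, 50],
    "SALES": [3000.0, 4000.0, 2000.0, 6000.0],
})

# Hypothetical criteria table: (column, comparison function, value).
criteria = [
    ("COUNTRY", operator.eq, "Spain"),
    ("QUANTITYORDERED", operator.le, 50),
    ("SALES", operator.le, 5000),
]

# Each entry becomes one boolean Series.
conditions = [compare(toy[column], value) for column, compare, value in criteria]

# operator.and_ is the function form of &, so no lambda is needed.
all_conditions = functools.reduce(operator.and_, conditions)
result = toy[all_conditions]
```

Adding or removing a filter then only means editing `criteria`; the reduce step stays the same.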


## 3) Implementing the If...then constructs:

A very common pattern in problem solving is:

    if cond1:
        exp1
    elif cond2:
        exp2
    else:
        exp3
            
This can be approached in multiple ways using Pandas

### 1) Define a function that executes this logic and apply it to every value in a column

In [22]:
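def assignIncome(sales):
    # Label a single SALES value; the branches are checked in order
    if sales > 10000:
        return 'Gain'
    elif sales > 5000:
        # Reached only when sales <= 10000, so exactly 10000 is covered too
        return 'No change'
    else:
        return 'Loss'

sales_data['Income Statement'] = sales_data['SALES'].apply(assignIncome)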

### 2) Using a lambda function

In [23]:
# Same logic as a nested conditional expression; label spelling matches the other cells
sales_data['Income Statement'] = sales_data['SALES'].apply(
    lambda x: 'Gain' if x > 10000 else ('No change' if x > 5000 else 'Loss'))


### 3) Using a list comprehension, which can be faster than .apply

In [24]:
sales_data['Income Statement'] = ['Gain' if x > 10000
                                  else ('No change' if x > 5000 else 'Loss')
                                  for x in sales_data['SALES']]

Let's take a look at the DataFrame now.

In [25]:
sales_data.head(10)

Unnamed: 0,ORDERNUMBER,QUANTITYORDERED,PRICEEACH,ORDERLINENUMBER,SALES,ORDERDATE,STATUS,QTR_ID,MONTH_ID,YEAR_ID,...,ADDRESSLINE1,CITY,STATE,POSTALCODE,COUNTRY,TERRITORY,CONTACTLASTNAME,CONTACTFIRSTNAME,DEALSIZE,Income Statement
0,10107,30,95.7,2,2871.0,2/24/2003 0:00,Shipped,1,2,2003,...,897 Long Airport Avenue,NYC,NY,10022,USA,Unknown,Yu,Kwai,Small,Loss
1,10121,34,81.35,5,2765.9,5/7/2003 0:00,Shipped,2,5,2003,...,59 rue de l'Abbaye,Reims,Unknown,51100,France,EMEA,Henriot,Paul,Small,Loss
2,10134,41,94.74,2,3884.34,7/1/2003 0:00,Shipped,3,7,2003,...,27 rue du Colonel Pierre Avia,Paris,Unknown,75508,France,EMEA,Da Cunha,Daniel,Medium,Loss
3,10145,45,83.26,6,3746.7,8/25/2003 0:00,Shipped,3,8,2003,...,78934 Hillside Dr.,Pasadena,CA,90003,USA,Unknown,Young,Julie,Medium,Loss
4,10159,49,100.0,14,5205.27,10/10/2003 0:00,Shipped,4,10,2003,...,7734 Strong St.,San Francisco,CA,Unknown,USA,Unknown,Brown,Julie,Medium,No change
5,10168,36,96.66,1,3479.76,10/28/2003 0:00,Shipped,4,10,2003,...,9408 Furth Circle,Burlingame,CA,94217,USA,Unknown,Hirano,Juri,Medium,Loss
6,10180,29,86.13,9,2497.77,11/11/2003 0:00,Shipped,4,11,2003,...,"184, chausse de Tournai",Lille,Unknown,59000,France,EMEA,Rance,Martine,Small,Loss
7,10188,48,100.0,1,5512.32,11/18/2003 0:00,Shipped,4,11,2003,...,"Drammen 121, PR 744 Sentrum",Bergen,Unknown,N 5804,Norway,EMEA,Oeztan,Veysel,Medium,No change
8,10201,22,98.57,2,2168.54,12/1/2003 0:00,Shipped,4,12,2003,...,5557 North Pendale Street,San Francisco,CA,Unknown,USA,Unknown,Murphy,Julie,Small,Loss
9,10211,41,100.0,14,4708.44,1/15/2004 0:00,Shipped,1,1,2004,...,"25, rue Lauriston",Paris,Unknown,75016,France,EMEA,Perrier,Dominique,Medium,Loss


These are three ways of arriving at the same solution with pandas, and many other alternatives exist.
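
One such alternative, offered here as a sketch rather than as part of the original three, is NumPy's `np.select`, which evaluates the whole column at once instead of one value at a time. The toy frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({"SALES": [12000.0, 7500.0, 5205.27, 3000.0]})

# Conditions are checked in order and the first match wins, so the second
# condition only needs "> 5000".
conditions = [
    toy["SALES"] > 10000,
    toy["SALES"] > 5000,
]
labels = ["Gain", "No change"]

# Rows matching no condition receive the default label.
toy["Income Statement"] = np.select(conditions, labels, default="Loss")
```

In this sketch a value of exactly 10000 falls under 'No change', which is an assumption about how the boundary should be handled.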

# Thank you!