# 4- Filtering

Filtering is an essential tool for data analysis. It lets you select specific parts of a DataFrame so that the analysis focuses on the data that is actually relevant. Pandas, being a tool built for data analysis, supports filtering directly.


In [1]:
# Importing pandas
import pandas as pd

In [2]:
# Creating people DF
people_df = pd.DataFrame(
    {
        'first': ['John', 'Paul', 'George', 'Ringo'],
        'last': ['Lennon', 'McCartney', 'Harrison', 'Starr'],
        'birthyear': [1940, 1942, 1943, 1940],
        'email': [
            'john.lennon@email.com',
            'paul.mccartney@email.com',
            'george.harrison@email.com',
            'ringo.starr@email.com',
        ],
    }
)

In [3]:
# Loading the Stack Overflow survey results into a DF
df = pd.read_csv(
    'stackoverflow-developer-survey-2019/survey_results_public.csv',
    index_col='Respondent',
)

## Using filters

To use a filter, put a conditional inside `[]`, either directly on the DF or through a location function. A good practice is to store the conditional in a variable, which makes it reusable.

A conditional on a column evaluates to a Series of boolean values: `True` where the condition is met and `False` where it isn't. When you use that Series as an "index" in a DF, pandas returns only the rows marked `True`.

Python's `and` and `or` keywords aren't supported in these conditionals. Instead, use `&` for "and" and `|` for "or", and wrap each condition in parentheses.
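As a minimal sketch of why the bitwise operators are needed, the example below uses a small hypothetical `bands_df` (not part of this notebook). Python's `and` asks a whole Series for a single truth value, which pandas refuses to give, while `&` and `|` compare element by element. The parentheses matter because `&` and `|` bind more tightly than `==`.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
bands_df = pd.DataFrame(
    {
        'first': ['John', 'Paul', 'George', 'Ringo'],
        'birthyear': [1940, 1942, 1943, 1940],
    }
)

born_1940 = bands_df['birthyear'] == 1940
is_ringo = bands_df['first'] == 'Ringo'

# `and` tries to collapse each Series into one bool, which is ambiguous
try:
    born_1940 and is_ringo
    and_worked = True
except ValueError:
    and_worked = False

# Element-wise versions
both = born_1940 & is_ringo    # True only where both conditions hold
either = born_1940 | is_ringo  # True where at least one condition holds

print(and_worked)
print(both.tolist())
print(either.tolist())
```

Storing each condition in its own variable, as done here, also sidesteps the precedence problem, since no comparison shares an expression with `&` or `|`.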


In [4]:
# Creating a filter for the last name Lennon or the first name Ringo
people_filter = (people_df['last'] == 'Lennon') | (people_df['first'] == 'Ringo')

In [10]:
# Creating a filter for people who worked with python
programming_language_filter = df['LanguageWorkedWith'].str.contains('Python', na=False)
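The `na=False` argument in the cell above deserves a short illustration. The sketch below uses a hypothetical `languages` Series standing in for a column like `LanguageWorkedWith`, with one missing answer. Passing `na=False` makes missing entries count as "does not contain", so the result is a clean boolean mask. Without it, missing entries may come back as missing values in the result, depending on the pandas version and dtype, and such a mask is not safe to index with.

```python
import pandas as pd

# Hypothetical stand-in for a survey column with one missing answer
languages = pd.Series(['Python;C', None, 'Java;SQL', 'HTML;Python'])

# na=False treats missing answers as "does not contain"
mask = languages.str.contains('Python', na=False)

print(mask.tolist())
print(languages[mask].tolist())
```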

In [None]:
# Showing people filter
people_filter

0     True
1    False
2    False
3     True
dtype: bool

In [11]:
# Showing programming language filter
programming_language_filter

Respondent
1         True
2         True
3        False
4         True
5         True
         ...  
88377    False
88601    False
88802    False
88816    False
88863    False
Name: LanguageWorkedWith, Length: 88883, dtype: bool

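Again, to use a filter, pass it to `loc` as the first argument in `[]`; an optional second argument selects the columns to return. Note that `iloc` is position-based and does not accept a boolean Series as a filter.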
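The difference between the two location methods can be sketched with a small hypothetical `toy_df` (not part of this notebook). `loc` takes the boolean Series as it is, while `iloc` rejects it and only works with a plain boolean array, obtained here with `to_numpy()`.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
toy_df = pd.DataFrame(
    {
        'first': ['John', 'Paul', 'George', 'Ringo'],
        'birthyear': [1940, 1942, 1943, 1940],
    }
)
mask = toy_df['birthyear'] == 1940

# loc accepts the boolean Series, plus an optional column selection
by_label = toy_df.loc[mask, 'first']

# iloc refuses a boolean Series because the Series carries an index
try:
    toy_df.iloc[mask]
    iloc_accepts_series = True
except (NotImplementedError, ValueError):
    iloc_accepts_series = False

# A plain boolean array has no index, so iloc can use it by position
by_position = toy_df.iloc[mask.to_numpy()]

print(by_label.tolist())
print(iloc_accepts_series)
print(by_position['first'].tolist())
```

In practice `loc` is usually the simpler choice for filters, since it aligns the mask with the DF's index and can pick columns by name in the same call.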


In [12]:
# Showing the emails of people who don't match the filter (~ = not)
people_df.loc[~people_filter, 'email']

1     paul.mccartney@email.com
2    george.harrison@email.com
Name: email, dtype: object

In [13]:
# Showing people who worked with Python
df.loc[programming_language_filter]

Respondent,MainBranch,Hobbyist,OpenSourcer,OpenSource,Employment,Country,Student,EdLevel,UndergradMajor,EduOther,...,WelcomeChange,SONewContent,Age,Gender,Trans,Sexuality,Ethnicity,Dependents,SurveyLength,SurveyEase
1,I am a student who is learning to code,Yes,Never,The quality of OSS and closed source software ...,"Not employed, and not looking for work",United Kingdom,No,Primary/elementary school,,"Taught yourself a new language, framework, or ...",...,Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,14.0,Man,No,Straight / Heterosexual,,No,Appropriate in length,Neither easy nor difficult
2,I am a student who is learning to code,No,Less than once per year,The quality of OSS and closed source software ...,"Not employed, but looking for work",Bosnia and Herzegovina,"Yes, full-time","Secondary school (e.g. American high school, G...",,Taken an online course in programming or softw...,...,Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,19.0,Man,No,Straight / Heterosexual,,No,Appropriate in length,Neither easy nor difficult
4,I am a developer by profession,No,Never,The quality of OSS and closed source software ...,Employed full-time,United States,No,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Taken an online course in programming or softw...,...,Just as welcome now as I felt last year,Tech articles written by other developers;Indu...,22.0,Man,No,Straight / Heterosexual,White or of European descent,No,Appropriate in length,Easy
5,I am a developer by profession,Yes,Once a month or more often,"OSS is, on average, of HIGHER quality than pro...",Employed full-time,Ukraine,No,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Taken an online course in programming or softw...,...,Just as welcome now as I felt last year,Tech meetups or events in your area;Courses on...,30.0,Man,No,Straight / Heterosexual,White or of European descent;Multiracial,No,Appropriate in length,Easy
8,I code primarily as a hobby,Yes,Less than once per year,"OSS is, on average, of HIGHER quality than pro...","Not employed, but looking for work",India,,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","Taught yourself a new language, framework, or ...",...,A lot more welcome now than last year,Tech articles written by other developers;Indu...,24.0,Man,No,Straight / Heterosexual,,,Appropriate in length,Neither easy nor difficult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
84539,,Yes,Less than once a month but more than once per ...,The quality of OSS and closed source software ...,Employed full-time,United Kingdom,"Yes, full-time","Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Taken an online course in programming or softw...,...,Just as welcome now as I felt last year,Courses on technologies you're interested in,23.0,Woman,Yes,Bisexual,White or of European descent,No,Appropriate in length,Easy
85738,,Yes,Never,The quality of OSS and closed source software ...,"Not employed, and not looking for work",Brazil,"Yes, full-time","Secondary school (e.g. American high school, G...",,"Taught yourself a new language, framework, or ...",...,Just as welcome now as I felt last year,Industry news about technologies you're intere...,15.0,Man,No,Straight / Heterosexual,Hispanic or Latino/Latina;White or of European...,No,Too short,Easy
86566,,Yes,Less than once a month but more than once per ...,"OSS is, on average, of HIGHER quality than pro...",Retired,Switzerland,No,Some college/university study without earning ...,"A humanities discipline (ex. literature, histo...",Taken a part-time in-person course in programm...,...,Just as welcome now as I felt last year,Tech articles written by other developers;Cour...,74.0,Man,No,,White or of European descent,No,Appropriate in length,Easy
87739,,Yes,Less than once per year,"OSS is, on average, of HIGHER quality than pro...",Employed part-time,Czech Republic,"Yes, full-time","Master’s degree (MA, MS, M.Eng., MBA, etc.)","Computer science, computer engineering, or sof...","Taught yourself a new language, framework, or ...",...,Just as welcome now as I felt last year,,25.0,,No,Straight / Heterosexual,White or of European descent,No,Appropriate in length,Easy
