## Data Analytics using pandas

In [1]:
# importing library

import pandas as pd

In [60]:
# importing music file

music = pd.read_csv('music.csv')
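
The `Unnamed: 0` column that appears in the outputs of this notebook usually means the CSV was saved together with its row index. As a minimal sketch, assuming `music.csv` starts with an unnamed index column, passing `index_col=0` to `pd.read_csv` treats that column as the index instead. The `csv_text` string below is a hypothetical stand-in for the first rows of the file, not the file itself.

```python
import io
import pandas as pd

# Hypothetical stand-in for the first rows of music.csv.
# The empty first header field is what pandas labels "Unnamed: 0".
csv_text = """,artist,country,plays,genre
0,The Beatles,UK,150,rock
1,Pink Floyd,UK,10000,rock
2,Metallica,US,500,metal
"""

# Default read: the saved index shows up as an extra column
with_extra = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the row index instead
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))
print(list(as_index.columns))
```

The rest of the notebook keeps the default read, so its outputs still show `Unnamed: 0`.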

### Basic Filtering

In [61]:
# filter rows whose artist is from the UK

uk_artist = music[music['country'] == 'UK']
uk_artist

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
1,Pink Floyd,UK,10000,rock


In [62]:
# another way

music[music.country == 'UK']

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
1,Pink Floyd,UK,10000,rock
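
Both forms above work the same way: the comparison produces a boolean Series (a mask), and indexing the DataFrame with that mask keeps only the rows where it is `True`. The sketch below makes the mask explicit. It rebuilds the DataFrame inline from the rows shown in this notebook rather than reading `music.csv`, and the `.loc` variant is an addition for illustration.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# Step 1: the comparison yields one True/False value per row
mask = music["country"] == "UK"
print(mask.tolist())  # [True, True, False, False, False, False, False]

# Step 2: indexing with the mask keeps the True rows
uk_artist = music[mask]

# .loc accepts the same mask and can select columns at the same time
uk_plays = music.loc[mask, ["artist", "plays"]]
print(uk_plays)
```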


### Multiple filters

We want to filter the `music` DataFrame to select only rock artists who have 200 plays or more. Each condition goes in its own parentheses, and the conditions are combined with `&`.

In [63]:
rock_200 = music[(music['genre'] == 'rock') & (music['plays'] >= 200)]
rock_200

Unnamed: 0,artist,country,plays,genre
1,Pink Floyd,UK,10000,rock
3,Cairokee,Egypt,200,rock
4,ACDC,US,250,rock
5,The Doors,US,1000,rock
6,Poets of The Fall,Finland,250,rock


In [64]:
# rock artists with more than 200 and at most 500 plays
rock_200_250 = music[(music['genre'] == 'rock') & (music['plays'] > 200) & (music['plays'] <= 500)]
rock_200_250

Unnamed: 0,artist,country,plays,genre
4,ACDC,US,250,rock
6,Poets of The Fall,Finland,250,rock
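
The parentheses around each condition matter because `&` binds more tightly than `==` and `>=` in Python. As an alternative sketch, `DataFrame.query` lets the same filters be written as a string with `and`. The DataFrame is rebuilt inline from the rows shown in this notebook, and the variable names are illustrative.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# Boolean-mask form, as used in the cells above
rock_200_mask = music[(music["genre"] == "rock") & (music["plays"] >= 200)]

# query() form: same conditions written as one expression string
rock_200_query = music.query("genre == 'rock' and plays >= 200")

# Range filter: more than 200 and at most 500 plays
rock_range = music.query("genre == 'rock' and plays > 200 and plays <= 500")

print(rock_200_query["artist"].tolist())
print(rock_range["artist"].tolist())
```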


### Negation

Suppose you want to filter for artists from outside the UK. The `~` operator inverts a boolean mask.

In [65]:
# one way

outside_uk = music[~(music['country'] == 'UK')]
outside_uk

Unnamed: 0,artist,country,plays,genre
2,Metallica,US,500,metal
3,Cairokee,Egypt,200,rock
4,ACDC,US,250,rock
5,The Doors,US,1000,rock
6,Poets of The Fall,Finland,250,rock


In [66]:
# another way

outside_uk = music[music['country'] != 'UK']
outside_uk

Unnamed: 0,artist,country,plays,genre
2,Metallica,US,500,metal
3,Cairokee,Egypt,200,rock
4,ACDC,US,250,rock
5,The Doors,US,1000,rock
6,Poets of The Fall,Finland,250,rock
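
The two approaches select the same rows. A small sketch that checks this, using an inline copy of the rows shown in this notebook; note that the parentheses in the `~(...)` form are needed because `~` binds more tightly than `==`.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# Invert an equality mask with ~
via_tilde = music[~(music["country"] == "UK")]

# Use the != comparison directly
via_not_equal = music[music["country"] != "UK"]

# Both selections contain exactly the same rows
print(via_tilde.equals(via_not_equal))  # True
```

The `~` form is the more general of the two, since it can invert any mask, including ones built with `isin` or string methods.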


## Challenge: Multiple Filters

### Problem Definition
A music label wants to evaluate the success of its artists in the past month. However, it is unfair to evaluate based on play count across different countries. The music label would like to view both of these groups at the same time:

- Artists outside the UK who have > 100 plays
- Artists inside the UK who have > 200 plays

In [67]:
out = music[
    (~(music['country'] == 'UK') & (music['plays'] > 100))
    | ((music['country'] == 'UK') & (music['plays'] > 200))
]
out['artist']

1           Pink Floyd
2            Metallica
3             Cairokee
4                 ACDC
5            The Doors
6    Poets of The Fall
Name: artist, dtype: object
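
A long combined condition like this is easier to check when each part is stored in a named mask first. The sketch below is one way to restate the same logic, using an inline copy of the rows shown in this notebook; the mask names are illustrative.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# One mask per idea
is_uk = music["country"] == "UK"
outside_over_100 = ~is_uk & (music["plays"] > 100)
inside_over_200 = is_uk & (music["plays"] > 200)

# A row qualifies if it satisfies either rule
out = music[outside_over_100 | inside_over_200]
print(out["artist"].tolist())
```

Only The Beatles are dropped here: they are in the UK but have 150 plays, below the UK threshold of 200.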

### Filtering by a List and String Filters

In this example, you want to filter artists who originate from either the US or the UK. `isin` returns `True` for every row whose value appears in the given list.

In [68]:
country_list = ['US', 'UK']
out = music[music['country'].isin(country_list)]
out

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
1,Pink Floyd,UK,10000,rock
2,Metallica,US,500,metal
4,ACDC,US,250,rock
5,The Doors,US,1000,rock
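
`isin` is a compact replacement for a chain of `==` comparisons joined with `|`. The sketch below shows the equivalence on an inline copy of the rows shown in this notebook.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# Membership test against a list of allowed values
by_isin = music[music["country"].isin(["US", "UK"])]

# The same filter written as explicit comparisons
by_or = music[(music["country"] == "US") | (music["country"] == "UK")]

print(by_isin.equals(by_or))  # True
```

The `isin` form scales better as the list of values grows, since the list can be built or loaded separately from the filter.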


Example: Filtering artists whose name starts with `The`

In [69]:
out = music[music['artist'].str.startswith('The')]
out

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
5,The Doors,US,1000,rock


In [70]:
# artists whose name contains 'Met' anywhere in the string

music[music['artist'].str.contains('Met')]

Unnamed: 0,artist,country,plays,genre
2,Metallica,US,500,metal
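
By default `str.contains` is case-sensitive and treats the pattern as a regular expression. As a hedged sketch, the `case` and `na` parameters can make the match more forgiving: `case=False` ignores letter case, and `na=False` treats missing names as non-matches. The DataFrame is an inline copy of the rows shown in this notebook.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

# Prefix match: only names that begin with "The"
starts_with_the = music[music["artist"].str.startswith("The")]

# Case-insensitive substring match anywhere in the name
contains_the = music[music["artist"].str.contains("the", case=False, na=False)]

print(starts_with_the["artist"].tolist())
print(contains_the["artist"].tolist())
```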


Your music data analyst is getting more curious; they want a list of all artists from the UK or Finland whose name contains the word `The`.

In [71]:
# the country must be in the list AND the name must contain 'The'
country_list = ['UK', 'Finland']
out = music[(music['country'].isin(country_list)) & (music['artist'].str.contains('The'))]
out

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
6,Poets of The Fall,Finland,250,rock


### Problem Definition

Your music analyst is getting more selective about the countries of origin of artists. They want to exclude artists from the UK or Finland, with the exception of still returning artists who have >= 10000 plays.

In [84]:
country_list = ['UK','Finland']
out = music[~(music['country'].isin(country_list)) | (music['plays'] >= 10000)]
list(out['artist'].values)

['Pink Floyd', 'Metallica', 'Cairokee', 'ACDC', 'The Doors']
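
The "exclude, except when" rule can be read as: keep a row if it is not in an excluded country, or if it meets the exception. A minimal sketch with named masks, on an inline copy of the rows shown in this notebook, that also lists the rows being dropped as a sanity check; the mask names are illustrative.

```python
import pandas as pd

# Inline copy of the rows shown in this notebook (stand-in for music.csv)
music = pd.DataFrame({
    "artist": ["The Beatles", "Pink Floyd", "Metallica", "Cairokee",
               "ACDC", "The Doors", "Poets of The Fall"],
    "country": ["UK", "UK", "US", "Egypt", "US", "US", "Finland"],
    "plays": [150, 10000, 500, 200, 250, 1000, 250],
    "genre": ["rock", "rock", "metal", "rock", "rock", "rock", "rock"],
})

in_excluded_country = music["country"].isin(["UK", "Finland"])
is_exception = music["plays"] >= 10000

# Keep rows outside the excluded countries, plus any exceptions
kept = music[~in_excluded_country | is_exception]

# The complement: excluded country and not an exception
dropped = music[in_excluded_country & ~is_exception]

print(kept["artist"].tolist())
print(dropped["artist"].tolist())
```

Pink Floyd stays in the result despite being from the UK because its 10000 plays meet the exception.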

In [78]:
# the full DataFrame, for reference

music

Unnamed: 0,artist,country,plays,genre
0,The Beatles,UK,150,rock
1,Pink Floyd,UK,10000,rock
2,Metallica,US,500,metal
3,Cairokee,Egypt,200,rock
4,ACDC,US,250,rock
5,The Doors,US,1000,rock
6,Poets of The Fall,Finland,250,rock
