# Summary
* This notebook shows the results of the model run in text_analysis.ipynb.
* The model is the [facebook/bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli) classifier, hosted on Hugging Face.
* The model takes ~30 minutes to run, so its results were saved as a csv file and are loaded here for easy reference.
* The results are in huggingface_results.csv, a file that sits in the [data folder](https://github.com/ds5110/stinky/tree/master/data).
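
text_analysis.ipynb is not reproduced here, so the following is only a rough sketch of how a zero-shot classifier such as facebook/bart-large-mnli could produce a `smell_type` column. The helper names `build_classifier` and `tag_descriptions`, the single-entry label list, and the rule for empty descriptions are assumptions for illustration and may differ from what the original notebook does.

```python
import pandas as pd

NO_DESCRIPTION = 'no description provided'


def build_classifier():
    """Load the zero-shot pipeline (downloads the model; slow on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline('zero-shot-classification', model='facebook/bart-large-mnli')


def tag_descriptions(descriptions, classify, candidate_labels):
    """Return one tag per description.

    `classify` is any callable that follows the pipeline's convention:
    classify(text, candidate_labels=[...]) -> {'labels': [...], 'scores': [...]},
    with labels ordered from most to least likely.
    """
    tags = []
    for text in descriptions:
        if pd.isna(text) or not str(text).strip():
            tags.append(NO_DESCRIPTION)  # nothing to classify
        else:
            result = classify(text, candidate_labels=candidate_labels)
            tags.append(result['labels'][0])  # keep the top-ranked label
    return tags


# Hypothetical usage (not run here because the model is slow to load):
# classifier = build_classifier()
# labels = ['petroleum']  # assumed label list
# df['smell_type'] = tag_descriptions(df['smell description'], classifier, labels)
```

Passing the classifier in as an argument keeps the tagging logic testable with a lightweight stand-in, without loading the model.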

# To update:
* If huggingface_results.csv changes, rerun all cells (Runtime --> Run all) and this notebook will show the updated results.

In [1]:
# load the saved model results (huggingface_results.csv)
import pandas as pd

url = 'https://raw.githubusercontent.com/ds5110/stinky/master/smell_data/smell_intermediary_files/huggingface_results.csv'
huggingface_results = pd.read_csv(url)
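
Because the rest of the notebook relies on the `smell_type` and `smell description` columns, a quick check after loading can catch a csv whose layout has changed. A minimal sketch, where `missing_columns` is a hypothetical helper and the small frame is a made-up stand-in for the real data:

```python
import pandas as pd

REQUIRED_COLUMNS = ['smell description', 'smell_type']


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Made-up stand-in for huggingface_results
example = pd.DataFrame({'smell description': ['Tar'], 'smell_type': ['petroleum']})
print(missing_columns(example))                              # []
print(missing_columns(example.drop(columns='smell_type')))   # ['smell_type']
```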

In [2]:
huggingface_results

Unnamed: 0,Id,Report Source,Category,Created at local,Closed at local,Status,Address,smell description,URL,Latitude,Longitude,Export tagged places,date,time,Day,Month,Year,Hour,Month_name,Date & time (hour rounded),epoch time,date & time,smell value,zipcode,symptoms,additional comments,smell_type
0,7181157.0,iPhone,Odor,1/7/20 8:26,1/7/20 9:20,Archived,315 Spring Street,Petroleum smell coming from south portland,https://crm.seeclickfix.com/#/organizations/61...,43.647740,-70.269455,City Council District 2,1/7/20,8:26:00,7,1,2020,8,Jan,1/7/20 8:00,,,,,,,petroleum
1,7181402.0,Android,Odor,1/7/20 9:11,1/7/20 9:20,Archived,25 Cushman St,usual petroleum,https://crm.seeclickfix.com/#/organizations/61...,43.649448,-70.268626,City Council District 2,1/7/20,9:11:00,7,1,2020,9,Jan,1/7/20 9:00,,,,,,,petroleum
2,7192000.0,Android,Odor,1/9/20 7:14,1/9/20 8:45,Archived,25 Cushman St,usual petroleum,https://crm.seeclickfix.com/#/organizations/61...,43.649448,-70.268626,City Council District 2,1/9/20,7:14:00,9,1,2020,7,Jan,1/9/20 7:00,,,,,,,petroleum
3,7206428.0,Android,Odor,1/13/20 8:22,1/13/20 9:09,Archived,25 Cushman St,worst yet,https://crm.seeclickfix.com/#/organizations/61...,43.649448,-70.268626,City Council District 2,1/13/20,8:22:00,13,1,2020,8,Jan,1/13/20 8:00,,,,,,,petroleum
4,7210067.0,Android,Odor,1/14/20 8:24,1/14/20 14:50,Archived,25 Cushman St,usual petroleum stink. Cushman and Reiche play...,https://crm.seeclickfix.com/#/organizations/61...,43.649448,-70.268626,City Council District 2,1/14/20,8:24:00,14,1,2020,8,Jan,1/14/20 8:00,,,,,,,petroleum
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2902,,,,,,,,,,43.651800,-70.273600,,7/22/21,19:10:56,22,7,2021,19,Jul,7/22/21 19:00,1.626995e+09,7/22/21 19:10,4.0,4102.0,,,no description provided
2903,,,,,,,,Tank fumes,,43.634000,-70.284900,,7/22/21,21:04:39,22,7,2021,21,Jul,7/22/21 21:00,1.627002e+09,7/22/21 21:04,5.0,4106.0,,,petroleum
2904,,,,,,,,Petroleum smell most nites at 2am!!,,43.642800,-70.245200,,7/23/21,2:05:15,23,7,2021,2,Jul,7/23/21 2:00,1.627020e+09,7/23/21 2:05,5.0,4106.0,,,petroleum
2905,,,,,,,,Tar,,43.632500,-70.273100,,7/23/21,8:28:44,23,7,2021,8,Jul,7/23/21 8:00,1.627043e+09,7/23/21 8:28,3.0,4106.0,,,petroleum


In [3]:
huggingface_results['smell_type'].unique()

array(['petroleum', 'no description provided'], dtype=object)
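
`unique()` lists the labels but not how often each occurs. `value_counts()` gives a per-label tally. The sketch below uses a small made-up frame rather than the real data, so its counts are illustrative only.

```python
import pandas as pd

# Made-up stand-in for huggingface_results
example = pd.DataFrame({
    'smell description': ['Tar', 'Tank fumes', None],
    'smell_type': ['petroleum', 'petroleum', 'no description provided'],
})

# Number of rows per label, most frequent first
counts = example['smell_type'].value_counts()
print(counts)
```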

In [4]:
huggingface_results.shape

(2907, 27)

In [5]:
# count the non-null descriptions among rows tagged 'petroleum'
n_petroleum = huggingface_results.loc[huggingface_results['smell_type'] == 'petroleum', 'smell description'].count()
print(f'{n_petroleum} descriptions are tagged as petroleum-related')

2308 descriptions are tagged as petroleum-related
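
Note that `Series.count()` counts only non-null values, so the figure above is the number of petroleum-tagged rows that have a description, which can differ from the number of petroleum-tagged rows. A small sketch with made-up data showing the two counts side by side:

```python
import pandas as pd

# Made-up stand-in for huggingface_results
example = pd.DataFrame({
    'smell description': ['Tar', None, 'Tank fumes', None],
    'smell_type': ['petroleum', 'petroleum', 'petroleum', 'no description provided'],
})

is_petroleum = example['smell_type'] == 'petroleum'

# count() skips missing descriptions; sum() of the mask counts every matching row
n_with_description = example.loc[is_petroleum, 'smell description'].count()
n_rows = int(is_petroleum.sum())
print(n_with_description, n_rows)
```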


In [7]:
huggingface_results.loc[huggingface_results['smell_type']=='no description provided']

Unnamed: 0,Id,Report Source,Category,Created at local,Closed at local,Status,Address,smell description,URL,Latitude,Longitude,Export tagged places,date,time,Day,Month,Year,Hour,Month_name,Date & time (hour rounded),epoch time,date & time,smell value,zipcode,symptoms,additional comments,smell_type
14,7492187.0,Portal,Odor,3/2/20 7:09,3/2/20 8:38,Archived,15 Thomas St,,https://crm.seeclickfix.com/#/organizations/61...,43.647399,-70.270398,City Council District 2,3/2/20,7:09:00,2,3,2020,7,Mar,3/2/20 7:00,,,,,,,no description provided
27,7566599.0,iPhone,Odor,3/19/20 6:58,3/20/20 8:03,Archived,52 Bowdoin St,,https://crm.seeclickfix.com/#/organizations/61...,43.646769,-70.274630,City Council District 2,3/19/20,6:58:00,19,3,2020,6,Mar,3/19/20 6:00,,,,,,,no description provided
28,7571927.0,iPhone,Odor,3/20/20 15:23,3/20/20 15:26,Archived,44 Bowdoin St,,https://crm.seeclickfix.com/#/organizations/61...,43.646678,-70.274327,City Council District 2,3/20/20,15:23:00,20,3,2020,15,Mar,3/20/20 15:00,,,,,,,no description provided
34,7658949.0,iPhone,Odor,4/12/20 13:49,4/13/20 6:58,Archived,25 Vaughan St,,https://crm.seeclickfix.com/#/organizations/61...,43.645280,-70.272406,City Council District 2,4/12/20,13:49:00,12,4,2020,13,Apr,4/12/20 13:00,,,,,,,no description provided
39,7687804.0,iPhone,Odor,4/19/20 16:03,4/20/20 12:18,Archived,389 Danforth St,,https://crm.seeclickfix.com/#/organizations/61...,43.644554,-70.271605,City Council District 2,4/19/20,16:03:00,19,4,2020,16,Apr,4/19/20 16:00,,,,,,,no description provided
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2888,,,,,,,,,,43.650200,-70.272800,,7/20/21,19:21:16,20,7,2021,19,Jul,7/20/21 19:00,1.626823e+09,7/20/21 19:21,5.0,4102.0,,,no description provided
2892,,,,,,,,,,43.636900,-70.258000,,7/21/21,11:58:49,21,7,2021,11,Jul,7/21/21 11:00,1.626883e+09,7/21/21 11:58,3.0,4106.0,,,no description provided
2895,,,,,,,,,,43.628400,-70.279700,,7/21/21,20:09:08,21,7,2021,20,Jul,7/21/21 20:00,1.626913e+09,7/21/21 20:09,5.0,4106.0,,,no description provided
2896,,,,,,,,,,43.621300,-70.278000,,7/21/21,22:42:58,21,7,2021,22,Jul,7/21/21 22:00,1.626922e+09,7/21/21 22:42,4.0,4106.0,,,no description provided
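
The rows displayed above all have an empty `smell description`. To confirm that this holds for every row with this label, not just the ones displayed, a check along these lines could be used (sketched on made-up data; `mislabeled` is a hypothetical name):

```python
import pandas as pd

# Made-up stand-in for huggingface_results
example = pd.DataFrame({
    'smell description': [None, 'Tar', None],
    'smell_type': ['no description provided', 'petroleum', 'no description provided'],
})

no_desc = example['smell_type'] == 'no description provided'

# Rows labeled as having no description that nevertheless contain text
mislabeled = example[no_desc & example['smell description'].notna()]
print(len(mislabeled))
```

An empty `mislabeled` frame means the label and the missing descriptions agree.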
