# User improvement

This script groups entries made by the same user and orders each user's entries from oldest to newest. This can facilitate future research into how citizen scientists improve over time.



# Downloading Data Procedure:

## Use:
Run the following cell if you would like to get data from the GLOBE API.

## Procedure
Downloading data from the API follows these steps:
- Request the non-GeoJSON data from the GLOBE API
- Get the "results" list from the JSON response and pass it into a pandas dataframe
- Unpack the nested "data" entry of each row into its own columns
- Merge the unpacked columns onto the original dataframe
- Remove the "data" entry from the original dataframe

In [1]:
import pandas as pd
import requests

start_date = "2017-05-29"
end_date = "2020-05-31"
url = f"https://api.globe.gov/search/v1/measurement/protocol/measureddate/?protocols=mosquito_habitat_mapper&startdate={start_date}&enddate={end_date}&geojson=FALSE&sample=FALSE"

# downloads data from the GLOBE API and stops early on an HTTP error
response = requests.get(url)
response.raise_for_status()

# converts the "results" list into a usable dataframe
data = response.json()["results"]

temp_data_df = pd.DataFrame(data)

# unpacks the nested "data" dicts into their own columns and merges them back
data_df = pd.DataFrame(temp_data_df["data"].to_dict())
data_df = data_df.transpose()
temp_data_df = temp_data_df.join(data_df)
temp_data_df.drop(["data"], axis=1, inplace = True)
temp_data_df

Unnamed: 0,protocol,measuredDate,createDate,updateDate,publishDate,organizationId,organizationName,siteId,siteName,countryName,...,mosquitohabitatmapperComments,mosquitohabitatmapperMosquitoPupae,mosquitohabitatmapperWaterSourcePhotoUrls,mosquitohabitatmapperDataSource,mosquitohabitatmapperLarvaFullBodyPhotoUrls,mosquitohabitatmapperMeasurementLatitude,mosquitohabitatmapperLastIdentifyStage,mosquitohabitatmapperWaterSourceType,mosquitohabitatmapperMosquitoHabitatMapperId,mosquitohabitatmapperMeasurementLongitude
0,mosquito_habitat_mapper,2018-11-25,2020-01-25T18:09:52,2020-01-25T18:09:52,2020-02-14T20:29:11,13063641.0,GPM Satellite Mission,35785,18SUJ105472,United States,...,,False,https://data.globe.gov/system/photos/2018/11/2...,GLOBE Observer App,,39.2538,,container: artificial,5188,-77.1959
1,mosquito_habitat_mapper,2019-04-07,2020-01-25T18:24:27,2020-01-25T18:24:27,2020-03-20T22:19:48,13063641.0,GPM Satellite Mission,35785,18SUJ105472,United States,...,,False,https://data.globe.gov/system/photos/2019/04/0...,GLOBE Observer App,,39.2535,identify,container: artificial,10365,-77.196
2,mosquito_habitat_mapper,2019-04-07,2020-01-25T18:24:27,2020-01-25T18:24:27,2020-03-20T22:19:48,13063641.0,GPM Satellite Mission,35785,18SUJ105472,United States,...,,False,https://data.globe.gov/system/photos/2019/04/0...,GLOBE Observer App,,39.2536,identify,container: artificial,10360,-77.1956
3,mosquito_habitat_mapper,2019-05-29,2020-01-25T18:29:36,2020-01-25T18:29:36,2020-03-20T22:19:48,13063641.0,GPM Satellite Mission,35785,18SUJ105472,United States,...,,False,https://data.globe.gov/system/photos/2019/05/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/05/2...,39.2542,identify-siphon-pecten,container: artificial,12424,-77.1962
4,mosquito_habitat_mapper,2019-08-04,2020-01-25T18:45:20,2020-01-25T18:45:20,2020-03-20T22:19:48,13063641.0,GPM Satellite Mission,35785,18SUJ105472,United States,...,,False,https://data.globe.gov/system/photos/2019/08/0...,GLOBE Observer App,,39.2536,identify,container: artificial,14822,-77.1957
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22449,mosquito_habitat_mapper,2020-05-30,2020-05-30T18:25:03,2020-05-30T19:30:02,2020-08-26T21:23:29,14054356.0,lycee Thilmakha,200580,28PCB553673,Senegal,...,,True,https://data.globe.gov/system/photos/2020/05/3...,GLOBE Observer App,https://data.globe.gov/system/photos/2020/05/3...,15.0777,identify-siphon-shape,container: artificial,22775,-16.3463
22450,mosquito_habitat_mapper,2020-05-29,2020-06-03T07:30:03,2020-06-03T16:15:20,2020-08-26T21:23:29,19841715.0,Madagascar Citizen Science,201123,38KQE645064,,...,,False,https://data.globe.gov/system/photos/2020/05/2...,GLOBE Observer App,,-18.9168,identify,container: artificial,22806,47.5121
22451,mosquito_habitat_mapper,2020-05-06,2020-07-14T08:40:05,2020-07-14T13:40:07,2020-08-26T21:23:29,18306968.0,Taiwan Partnership Citizen Science,208771,51RUH399684,,...,,True,,GLOBE Observer App,https://data.globe.gov/system/photos/2020/05/0...,25.0235,identify-basal-tuft,container: artificial,24654,121.413
22452,mosquito_habitat_mapper,2020-05-31,2020-07-18T23:35:02,2020-07-18T23:35:02,2020-08-26T21:23:29,14054356.0,lycee Thilmakha,209660,28PDB055429,Senegal,...,,True,https://data.globe.gov/system/photos/2020/05/3...,GLOBE Observer App,,14.8591,identify-aedes-tuft,container: artificial,24880,-15.8784
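
The unpack-and-merge steps can be tried offline on a small made-up payload. The sketch below is only an illustration: the two records and their user IDs are invented, and only the column names are taken from the output above. It builds the unpacked columns directly from the list of nested dicts instead of using `to_dict()` and `transpose()`, which should give the same columns.

```python
import pandas as pd

# Invented records that mimic the shape of the API's "results" list
results = [
    {
        "protocol": "mosquito_habitat_mapper",
        "measuredDate": "2019-04-07",
        "data": {
            "mosquitohabitatmapperUserid": 101,
            "mosquitohabitatmapperWaterSourceType": "container: artificial",
        },
    },
    {
        "protocol": "mosquito_habitat_mapper",
        "measuredDate": "2018-11-25",
        "data": {
            "mosquitohabitatmapperUserid": 102,
            "mosquitohabitatmapperWaterSourceType": "still: lake/pond/swamp",
        },
    },
]

raw_df = pd.DataFrame(results)

# Expand each nested "data" dict into its own columns, keeping the row index
unpacked = pd.DataFrame(raw_df["data"].tolist(), index=raw_df.index)

# Drop the nested column and join the unpacked columns back on the index
flat_df = raw_df.drop(columns="data").join(unpacked)
print(flat_df.columns.tolist())
```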


# Sorting/Grouping Procedure:

- Groups entries with the same user ID
- Creates a list of the user IDs and the size of each group
- Sorts the list by group size in descending order
- Iterates through the sorted list, sorts each group's observations by
measured date, and then adds the group to the final dataframe

Result: a dataframe in which entries that share the same user are grouped
into the same part of the spreadsheet. Within each group, the observations
made by the user are ordered from oldest to newest. This will allow people to
analyze the improvement of individual users over time.
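
As a rough illustration of the intended ordering, the sketch below applies the same idea to a tiny invented dataframe using a single `sort_values` call instead of a loop. The toy user IDs and dates are made up, and `n_entries` is a hypothetical helper column. Ties between users with the same number of entries are broken by user ID here, which may differ from the order produced by the cells in this notebook.

```python
import pandas as pd

# Invented observations: user 7 has three entries, user 3 has two, user 9 has one
toy = pd.DataFrame({
    "mosquitohabitatmapperUserid": [7, 3, 7, 3, 7, 9],
    "measuredDate": [
        "2019-05-01", "2018-01-10", "2018-12-15",
        "2017-06-02", "2019-01-20", "2020-02-02",
    ],
})

# Attach each row's group size (hypothetical helper column)
counts = toy["mosquitohabitatmapperUserid"].value_counts()
toy["n_entries"] = toy["mosquitohabitatmapperUserid"].map(counts)

# Largest groups first; within a user, oldest observation first
ordered = (
    toy.sort_values(
        ["n_entries", "mosquitohabitatmapperUserid", "measuredDate"],
        ascending=[False, True, True],
    )
    .drop(columns="n_entries")
    .reset_index(drop=True)
)
print(ordered)
```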

In [2]:
# creates groups by similar userID
groups = temp_data_df.groupby("mosquitohabitatmapperUserid")

In [3]:
# creates a list of users (sorted by number of entries, largest first)
gb = groups.size()
gb = gb.sort_values(ascending = False)
gb = gb.reset_index()
gb

Unnamed: 0,mosquitohabitatmapperUserid,0
0,51045191,739
1,50985322,565
2,5284745,456
3,51046601,408
4,21748177,374
...,...,...
4899,52560189,1
4900,52560018,1
4901,52559362,1
4902,52559352,1


In [6]:
# for each userID, sorts that user's entries (oldest to newest) and collects them;
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
sorted_groups = [
    groups.get_group(userID).sort_values("measuredDate")
    for userID in gb["mosquitohabitatmapperUserid"]
]
final_df = pd.concat(sorted_groups, ignore_index = True)

final_df

Unnamed: 0,protocol,measuredDate,createDate,updateDate,publishDate,organizationId,organizationName,siteId,siteName,countryName,...,mosquitohabitatmapperComments,mosquitohabitatmapperMosquitoPupae,mosquitohabitatmapperWaterSourcePhotoUrls,mosquitohabitatmapperDataSource,mosquitohabitatmapperLarvaFullBodyPhotoUrls,mosquitohabitatmapperMeasurementLatitude,mosquitohabitatmapperLastIdentifyStage,mosquitohabitatmapperWaterSourceType,mosquitohabitatmapperMosquitoHabitatMapperId,mosquitohabitatmapperMeasurementLongitude
0,mosquito_habitat_mapper,2018-12-15,2020-01-25T18:11:17,2020-01-25T18:11:17,2020-02-14T20:29:11,19383352.0,Senegal Citizen Science,143292,28PCB659632,,...,,True,https://data.globe.gov/system/photos/2018/12/1...,GLOBE Observer App,https://data.globe.gov/system/photos/2018/12/1...,15.041,identify-siphon-shape,container: artificial,6128,-16.2473
1,mosquito_habitat_mapper,2018-12-20,2020-01-25T18:11:17,2020-01-25T18:11:17,2020-02-14T20:29:11,19383352.0,Senegal Citizen Science,143292,28PCB659632,,...,,False,https://data.globe.gov/system/photos/2018/12/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2018/12/2...,15.0408,identify-siphon-shape,container: artificial,6147,-16.2474
2,mosquito_habitat_mapper,2019-02-20,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,,,146141,28PCB659632,,...,,True,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,15.0407,identify-siphon-shape,container: artificial,7635,-16.2473
3,mosquito_habitat_mapper,2019-02-24,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,,,146141,28PCB659632,,...,,True,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,15.041,identify-siphon-shape,container: artificial,8009,-16.2473
4,mosquito_habitat_mapper,2019-03-02,2020-01-25T18:18:54,2020-01-25T18:18:54,2020-03-20T22:19:48,,,146141,28PCB659632,,...,,True,https://data.globe.gov/system/photos/2019/03/0...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/03/0...,15.041,identify-siphon-shape,container: artificial,8066,-16.2473
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22449,mosquito_habitat_mapper,2019-02-23,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,17615655.0,Thailand Citizen Science,146285,47QQV895620,,...,,False,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,17.7268,identify-basal-tuft,container: artificial,7758,101.73
22450,mosquito_habitat_mapper,2019-02-23,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,17615655.0,Thailand Citizen Science,146285,47QQV895620,,...,,False,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,17.7267,identify-siphon-shape,still: lake/pond/swamp,7732,101.73
22451,mosquito_habitat_mapper,2019-02-23,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,17615655.0,Thailand Citizen Science,146286,47QQV894619,,...,,True,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,17.7261,identify-saddle-complete,container: artificial,7683,101.73
22452,mosquito_habitat_mapper,2019-02-23,2020-01-25T18:16:36,2020-01-25T18:16:36,2020-03-20T22:19:48,17615655.0,Thailand Citizen Science,146285,47QQV895620,,...,,True,https://data.globe.gov/system/photos/2019/02/2...,GLOBE Observer App,https://data.globe.gov/system/photos/2019/02/2...,17.7267,identify-saddle-complete,container: artificial,7682,101.73


In [5]:
# outputs to file
final_df.to_csv("User_List.csv")
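
One possible starting point for studying improvement over time is to number each user's observations and measure how long the user had been contributing when each one was made. The sketch below assumes a dataframe already grouped and sorted as above. The toy data and the column names `observation_number` and `days_since_first` are invented for illustration and are not part of the GLOBE data.

```python
import pandas as pd

# Invented, already-sorted observations for three users
sorted_df = pd.DataFrame({
    "mosquitohabitatmapperUserid": [7, 7, 7, 3, 3, 9],
    "measuredDate": [
        "2018-12-15", "2019-01-20", "2019-05-01",
        "2017-06-02", "2018-01-10", "2020-02-02",
    ],
})
sorted_df["measuredDate"] = pd.to_datetime(sorted_df["measuredDate"])

# 1-based position of each observation within its user's history
sorted_df["observation_number"] = (
    sorted_df.groupby("mosquitohabitatmapperUserid").cumcount() + 1
)

# Days elapsed since that user's first recorded observation
first_date = sorted_df.groupby("mosquitohabitatmapperUserid")["measuredDate"].transform("min")
sorted_df["days_since_first"] = (sorted_df["measuredDate"] - first_date).dt.days
print(sorted_df)
```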