# World Health Sensor Project

**By Carol Calderon**

Jun 2023

## Objective

Use the Global Health Observatory OData API to fetch data from its data collection, analyze the data using data science techniques, and create an interactive dashboard to visualize and explore the insights.

## Project steps:

1. **Select the Data Source:** Determine the specific OData API that provides the data needed for the analysis. Explore available public datasets. Determine the analysis goals.

2. **Understand the API:** Study the documentation of the OData API to understand its endpoints, query parameters, and how to retrieve data. Identify the available resources, data models, and relationships between entities.

3. **Retrieve and Preprocess Data:** Use the OData API to fetch the required data based on the analysis goals. Apply data preprocessing steps like cleaning, filtering, and transforming the data as necessary to prepare it for analysis.

4. **Analyze the Data:** Utilize data science techniques and libraries such as pandas, NumPy, or scikit-learn to perform exploratory data analysis (EDA), statistical analysis, visualizations, or machine learning tasks. Extract meaningful insights and identify patterns or correlations in the data.

5. **Choose the Visualization Tool:** Select the appropriate visualization tool like Matplotlib, Seaborn, or Plotly to create interactive and informative visualizations.

6. **Design the Dashboard:** Design the user interface and layout of the dashboard using tools like Dash, Streamlit, or Tableau. Create an intuitive and user-friendly dashboard that allows users to interact with the visualizations, apply filters, and explore the data dynamically.

7. **Deploy the Dashboard:** Host the dashboard on a web server or a cloud platform to make it accessible online. This could involve deploying it as a web application or utilizing cloud hosting services like Heroku or AWS.

8. **Test and Iterate:** Test the functionality and usability of the dashboard, seeking feedback from potential users. Iterate on the design and make improvements based on user input and additional data analysis requirements.

## Expected Project Tools:
    
Data Source: 

- The Global Health Observatory <https://www.who.int/data/gho>
- OData API documentation: <https://www.who.int/data/gho/info/gho-odata-api>

Retrieve the data: 

- **requests** or **urllib** to make HTTP requests to the OData API and retrieve the data.
- Pandas: Perform data preprocessing, cleaning, filtering, and transformation using the powerful data manipulation capabilities of pandas.

Analyze the Data:

- Pandas: Utilize pandas for exploratory data analysis (EDA), data wrangling, and statistical analysis on the retrieved data.
- NumPy: Use NumPy for numerical computations and mathematical operations.
- Scikit-learn: Apply machine learning algorithms for predictive modeling, clustering, or classification tasks if required.

Visualization Tools:

- Matplotlib: Create static visualizations such as line plots, bar charts, and scatter plots.
- Seaborn: Build aesthetically pleasing and informative statistical visualizations.
- Plotly: Develop interactive and customizable visualizations, including interactive charts and dashboards.

Design the Dashboard:

- Dash: Build interactive web-based dashboards using Python and HTML components.
- Streamlit: Create custom web applications for data analysis and visualization.
- Tableau: Utilize Tableau's intuitive drag-and-drop interface for building interactive dashboards.

Deploy the Dashboard:

- Heroku: Host your dashboard as a web application using Heroku's cloud platform.
- AWS (Amazon Web Services): Deploy your dashboard on AWS using services like Amazon EC2 or AWS Elastic Beanstalk.

Test and Iterate:

- User feedback and testing: Gather feedback from potential users and testers to improve the functionality and user experience of your dashboard.
- Jupyter Notebook or documentation: Document the project, including the data analysis process, code, and key findings.

## Understanding the API

Study the documentation of the OData API to understand its endpoints, query parameters, and how to retrieve data. Identify the available resources, data models, and relationships between entities.
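The cells in this section all repeat the same request pattern, so it can be factored into a small helper. The sketch below is one way to do that and is not part of the original notebook: `build_url` and `fetch_gho` are hypothetical names, the 30-second timeout is an assumed default, and the base URL is the one used by the requests throughout this section.

```python
import requests

BASE_URL = "https://ghoapi.azureedge.net/api"  # base URL used by the requests in this section


def build_url(resource, odata_filter=None):
    """Return the full URL for a resource, optionally with a $filter clause."""
    url = f"{BASE_URL}/{resource}"
    if odata_filter:
        url += f"?$filter={odata_filter}"
    return url


def fetch_gho(resource, odata_filter=None, timeout=30):
    """Fetch a GHO OData resource and return the list stored under the 'value' key."""
    response = requests.get(build_url(resource, odata_filter), timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of printing a message
    return response.json()["value"]
```

With this in place, `fetch_gho("Dimension")` would return the same list that the first cell stores in `data`.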

**Retrieving all Dimensions** 

```python
import requests

# Send a GET request to the Dimension endpoint
url = "https://ghoapi.azureedge.net/api/Dimension"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The OData payload wraps the records in a "value" list
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```


{'Code': 'ADVERTISINGTYPE', 'Title': 'SUBSTANCE_ABUSE_ADVERTISING_TYPES'}
{'Code': 'AGEGROUP', 'Title': 'Age Group'}
{'Code': 'ALCOHOLTYPE', 'Title': 'Beverage Types'}
{'Code': 'AMRGLASSCATEGORY', 'Title': 'AMR GLASS Category'}
{'Code': 'ARCHIVE', 'Title': 'Archive date'}
{'Code': 'ASSISTIVETECHBARRIER', 'Title': 'Barriers to accessing assistive products'}
{'Code': 'ASSISTIVETECHFUNDING', 'Title': 'Funding for assistive tech products'}
{'Code': 'ASSISTIVETECHPRODUCT', 'Title': 'Assistive technology product'}
{'Code': 'ASSISTIVETECHSATIACTIVITY', 'Title': 'Satisfaction with assistive products for different environments and activities'}
{'Code': 'ASSISTIVETECHSATISERVICE', 'Title': 'Satisfaction with assistive products and related services'}
{'Code': 'ASSISTIVETECHSOURCE', 'Title': 'Sources of assistive products'}
{'Code': 'ASSISTIVETECHSUBQUESTION', 'Title': 'Assistive technology subquestion'}
{'Code': 'ASSISTIVETECHTRAVELDISTANCE', 'Title': 'Travel distance to obtain assistive products

```python
# Print the whole list of dimension records at once
print(data)
```

[{'Code': 'ADVERTISINGTYPE', 'Title': 'SUBSTANCE_ABUSE_ADVERTISING_TYPES'}, {'Code': 'AGEGROUP', 'Title': 'Age Group'}, {'Code': 'ALCOHOLTYPE', 'Title': 'Beverage Types'}, {'Code': 'AMRGLASSCATEGORY', 'Title': 'AMR GLASS Category'}, {'Code': 'ARCHIVE', 'Title': 'Archive date'}, {'Code': 'ASSISTIVETECHBARRIER', 'Title': 'Barriers to accessing assistive products'}, {'Code': 'ASSISTIVETECHFUNDING', 'Title': 'Funding for assistive tech products'}, {'Code': 'ASSISTIVETECHPRODUCT', 'Title': 'Assistive technology product'}, {'Code': 'ASSISTIVETECHSATIACTIVITY', 'Title': 'Satisfaction with assistive products for different environments and activities'}, {'Code': 'ASSISTIVETECHSATISERVICE', 'Title': 'Satisfaction with assistive products and related services'}, {'Code': 'ASSISTIVETECHSOURCE', 'Title': 'Sources of assistive products'}, {'Code': 'ASSISTIVETECHSUBQUESTION', 'Title': 'Assistive technology subquestion'}, {'Code': 'ASSISTIVETECHTRAVELDISTANCE', 'Title': 'Travel distance to obtain assis

```python
# Print the raw, undecoded response body (bytes)
print(response.content)
```

b'{"@odata.context":"https://ghoapi.azureedge.net/api/$metadata#DIMENSION","value":[{"Code":"ADVERTISINGTYPE","Title":"SUBSTANCE_ABUSE_ADVERTISING_TYPES"},{"Code":"AGEGROUP","Title":"Age Group"},{"Code":"ALCOHOLTYPE","Title":"Beverage Types"},{"Code":"AMRGLASSCATEGORY","Title":"AMR GLASS Category"},{"Code":"ARCHIVE","Title":"Archive date"},{"Code":"ASSISTIVETECHBARRIER","Title":"Barriers to accessing assistive products"},{"Code":"ASSISTIVETECHFUNDING","Title":"Funding for assistive tech products"},{"Code":"ASSISTIVETECHPRODUCT","Title":"Assistive technology product"},{"Code":"ASSISTIVETECHSATIACTIVITY","Title":"Satisfaction with assistive products for different environments and activities"},{"Code":"ASSISTIVETECHSATISERVICE","Title":"Satisfaction with assistive products and related services"},{"Code":"ASSISTIVETECHSOURCE","Title":"Sources of assistive products"},{"Code":"ASSISTIVETECHSUBQUESTION","Title":"Assistive technology subquestion"},{"Code":"ASSISTIVETECHTRAVELDISTANCE","Title":
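The three outputs above show the same payload at different stages: individual records, the decoded list, and the raw bytes. A short sketch of how they relate, using a shortened copy of the raw response (only the first two dimensions are kept, so the closing brackets are added by hand):

```python
import json

# Shortened copy of response.content from the output above (first two records only).
raw = (
    b'{"@odata.context":"https://ghoapi.azureedge.net/api/$metadata#DIMENSION",'
    b'"value":[{"Code":"ADVERTISINGTYPE","Title":"SUBSTANCE_ABUSE_ADVERTISING_TYPES"},'
    b'{"Code":"AGEGROUP","Title":"Age Group"}]}'
)

payload = json.loads(raw)   # decode the JSON body into a dict
records = payload["value"]  # the list that the notebook stores in `data`

print(list(payload))        # ['@odata.context', 'value']
print(records[1]["Title"])  # Age Group
```

The `@odata.context` key is metadata about the response; the records themselves always sit under `value` in the outputs shown here.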

**Retrieving all Dimension Values for COUNTRY dimension**

```python
# Send a GET request to the COUNTRY dimension values endpoint
url = "https://ghoapi.azureedge.net/api/DIMENSION/COUNTRY/DimensionValues"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'Code': 'ABW', 'Title': 'Aruba', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'AMR', 'ParentTitle': 'Americas'}
{'Code': 'AFG', 'Title': 'Afghanistan', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'EMR', 'ParentTitle': 'Eastern Mediterranean'}
{'Code': 'AGO', 'Title': 'Angola', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'AFR', 'ParentTitle': 'Africa'}
{'Code': 'AIA', 'Title': 'Anguilla', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'AMR', 'ParentTitle': 'Americas'}
{'Code': 'ALB', 'Title': 'Albania', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'EUR', 'ParentTitle': 'Europe'}
{'Code': 'AND', 'Title': 'Andorra', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'EUR', 'ParentTitle': 'Europe'}
{'Code': 'ANT530', 'Title': 'SPATIAL_SYNONYM', 'ParentDimension': 'REGION', 'Dimension': 'COUNTRY', 'ParentCode': 'AMR', 'ParentTitle': 'Americas'}
{'Code': 'ANT532', 'T
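For a dashboard, these records are most useful as a lookup from country code to display name and WHO region. A minimal sketch, using three records copied from the output above; `build_country_lookup` is a hypothetical helper, and skipping entries titled `SPATIAL_SYNONYM` rests on the assumption that they are aliases rather than displayable country names.

```python
# Records copied from the output above; a real run would pass the full `data` list.
sample = [
    {'Code': 'ABW', 'Title': 'Aruba', 'ParentDimension': 'REGION',
     'Dimension': 'COUNTRY', 'ParentCode': 'AMR', 'ParentTitle': 'Americas'},
    {'Code': 'AFG', 'Title': 'Afghanistan', 'ParentDimension': 'REGION',
     'Dimension': 'COUNTRY', 'ParentCode': 'EMR', 'ParentTitle': 'Eastern Mediterranean'},
    {'Code': 'ANT530', 'Title': 'SPATIAL_SYNONYM', 'ParentDimension': 'REGION',
     'Dimension': 'COUNTRY', 'ParentCode': 'AMR', 'ParentTitle': 'Americas'},
]


def build_country_lookup(records, skip_titles=("SPATIAL_SYNONYM",)):
    """Map each country code to its name and WHO region, skipping placeholder titles."""
    lookup = {}
    for rec in records:
        if rec["Title"] in skip_titles:
            continue  # assumption: these entries are aliases, not displayable names
        lookup[rec["Code"]] = {"name": rec["Title"], "region": rec["ParentTitle"]}
    return lookup


countries = build_country_lookup(sample)
print(countries["AFG"])  # {'name': 'Afghanistan', 'region': 'Eastern Mediterranean'}
```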

**Retrieving all Indicators (list of indicators)** 

```python
# Send a GET request to the Indicator endpoint
url = "https://ghoapi.azureedge.net/api/Indicator"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'IndicatorCode': 'AIR_8', 'IndicatorName': 'Ambient air pollution attributable DALYs  in children under 5 years', 'Language': 'EN'}
{'IndicatorCode': 'AIR_90', 'IndicatorName': 'Ambient air pollution attributable DALYs  (per 100 000 population, age-standardized)', 'Language': 'EN'}
{'IndicatorCode': 'AMRGLASS_COORD03', 'IndicatorName': 'Nomination of National Focal Point (NFP)', 'Language': 'EN'}
{'IndicatorCode': 'AMRGLASS_QA03', 'IndicatorName': 'EQA provided to local laboratories participating in GLASS', 'Language': 'EN'}
{'IndicatorCode': 'AMRGLASS_SURVL02', 'IndicatorName': 'Number of National surveillance sites in each country providing data to GLASS: outpatient facility category', 'Language': 'EN'}
{'IndicatorCode': 'asfr1', 'IndicatorName': 'Adolescent fertility rate (per 1000 women aged 15-19 years)', 'Language': 'EN'}
{'IndicatorCode': 'ASSISTIVETECH_METNEED', 'IndicatorName': 'Prevalence of met need of assistive products (%)', 'Language': 'EN'}
{'IndicatorCode': 'Camp_airti

**Retrieving indicators whose name contains the word 'Tobacco'**

```python
# Send a GET request with an OData contains() filter on the indicator name
url = "https://ghoapi.azureedge.net/api/Indicator?$filter=contains(IndicatorName,'Tobacco')"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'IndicatorCode': 'Camp_gov_prog', 'IndicatorName': 'Campaign was part of a comprehensive tobacco control programme', 'Language': 'EN'}
{'IndicatorCode': 'E14a_prod_tv_films', 'IndicatorName': 'Ban on appearance of tobacco products in TV and/or films', 'Language': 'EN'}
{'IndicatorCode': 'E15c_sponsor_publicity', 'IndicatorName': 'Banning the publicity of financial or other sponsorship or support by the tobacco industry of events, activities, individuals', 'Language': 'EN'}
{'IndicatorCode': 'E17_csr_promo_others', 'IndicatorName': 'Ban on entities other than tobacco companies/tobacco industry publicizing the Corporate Social Responsibility activities of the tobacco companies', 'Language': 'EN'}
{'IndicatorCode': 'E22_subnational_exists', 'IndicatorName': 'Subnational bans on tobacco advertising, promotion and sponsorship', 'Language': 'EN'}
{'IndicatorCode': 'E25_ban_display_pt_of_sale', 'IndicatorName': 'Ban on display of tobacco products at points of sale', 'Language': 'EN'}
{'Indic
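When the full indicator list has already been downloaded, the same search can also be done on the client. A minimal sketch, assuming a case-insensitive substring match is what is wanted; `find_indicators` is a hypothetical helper and the sample records are copied from the indicator listings in this section.

```python
# Sample records copied from the indicator listings in this section.
indicators = [
    {'IndicatorCode': 'AIR_8',
     'IndicatorName': 'Ambient air pollution attributable DALYs  in children under 5 years',
     'Language': 'EN'},
    {'IndicatorCode': 'Camp_gov_prog',
     'IndicatorName': 'Campaign was part of a comprehensive tobacco control programme',
     'Language': 'EN'},
    {'IndicatorCode': 'E25_ban_display_pt_of_sale',
     'IndicatorName': 'Ban on display of tobacco products at points of sale',
     'Language': 'EN'},
]


def find_indicators(records, keyword):
    """Return {code: name} for indicators whose name contains keyword, ignoring case."""
    needle = keyword.casefold()
    return {
        rec["IndicatorCode"]: rec["IndicatorName"]
        for rec in records
        if needle in rec["IndicatorName"].casefold()
    }


matches = find_indicators(indicators, "Tobacco")
print(sorted(matches))  # ['Camp_gov_prog', 'E25_ban_display_pt_of_sale']
```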

**Retrieving a specific indicator**

```python
# Send a GET request with an exact-match filter on the indicator name
url = "https://ghoapi.azureedge.net/api/Indicator?$filter=IndicatorName eq 'Estimate of current tobacco use prevalence (%)'"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'IndicatorCode': 'M_Est_tob_curr', 'IndicatorName': 'Estimate of current tobacco use prevalence (%)', 'Language': 'EN'}
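The filter in this request contains spaces and a literal `%`, a character that URLs normally reserve for percent-escapes. The request above returned the expected record as written, but encoding the filter explicitly removes the ambiguity. A sketch using the standard library's `urllib.parse.quote`; `encode_filter` is a hypothetical helper, and leaving parentheses, commas, and single quotes unescaped is a readability choice rather than something the API is known to require.

```python
from urllib.parse import quote


def encode_filter(odata_filter):
    """Percent-encode an OData $filter expression for use in a URL."""
    # Parentheses, commas and single quotes stay readable;
    # spaces become %20 and a literal % becomes %25.
    return quote(odata_filter, safe="(),'")


flt = "IndicatorName eq 'Estimate of current tobacco use prevalence (%)'"
url = "https://ghoapi.azureedge.net/api/Indicator?$filter=" + encode_filter(flt)
print(url)
```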


**Retrieving indicator data**

```python
# Send a GET request for the indicator's data, using its IndicatorCode as the resource
url = "https://ghoapi.azureedge.net/api/M_Est_tob_curr"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'Id': 27792760, 'IndicatorCode': 'M_Est_tob_curr', 'SpatialDimType': 'COUNTRY', 'SpatialDim': 'DZA', 'TimeDimType': 'YEAR', 'TimeDim': 2020, 'Dim1Type': 'SEX', 'Dim1': 'BTSX', 'Dim2Type': None, 'Dim2': None, 'Dim3Type': None, 'Dim3': None, 'DataSourceDimType': None, 'DataSourceDim': None, 'Value': '21.5 [16.2-26.9]', 'NumericValue': 21.5, 'Low': 16.2, 'High': 26.9, 'Comments': 'Projected from surveys completed prior to 2020.', 'Date': '2022-01-17T09:29:56.497+01:00', 'TimeDimensionValue': '2020', 'TimeDimensionBegin': '2020-01-01T00:00:00+01:00', 'TimeDimensionEnd': '2020-12-31T00:00:00+01:00'}
{'Id': 27792761, 'IndicatorCode': 'M_Est_tob_curr', 'SpatialDimType': 'COUNTRY', 'SpatialDim': 'BEN', 'TimeDimType': 'YEAR', 'TimeDim': 2020, 'Dim1Type': 'SEX', 'Dim1': 'BTSX', 'Dim2Type': None, 'Dim2': None, 'Dim3Type': None, 'Dim3': None, 'DataSourceDimType': None, 'DataSourceDim': None, 'Value': '6.1 [4.6-7.6]', 'NumericValue': 6.1, 'Low': 4.6, 'High': 7.6, 'Comments': 'Projected from survey
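Each record carries the estimate twice: as a display string in `Value` and as numbers in `NumericValue`, `Low`, and `High`. The numeric fields are the ones to use for analysis; the sketch below only shows how the display string maps onto them, which can serve as a sanity check. `parse_value` is a hypothetical helper, and the pattern assumes non-negative numbers in the `estimate [low-high]` form seen in the output above.

```python
import re

# estimate, then "[low-high]"; integers and decimals are both accepted.
_VALUE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*\[(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)\]")


def parse_value(text):
    """Split 'estimate [low-high]' into three floats, or return None if it does not match."""
    match = _VALUE_RE.fullmatch(text.strip())
    if match is None:
        return None
    return tuple(float(group) for group in match.groups())


# Fields copied from the first record in the output above.
record = {'Value': '21.5 [16.2-26.9]', 'NumericValue': 21.5, 'Low': 16.2, 'High': 26.9}
print(parse_value(record['Value']))  # (21.5, 16.2, 26.9)
```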

**Filtering data**

```python
# Send a GET request that keeps only records where Dim1 (SEX) equals 'MLE'
url = "https://ghoapi.azureedge.net/api/M_Est_tob_curr?$filter=Dim1 eq 'MLE'"
response = requests.get(url)

# Check the response status code
if response.status_code == 200:
    # The records are returned under the "value" key
    data = response.json()['value']
    for item in data:
        print(item)
else:
    print("Request failed with status code:", response.status_code)
```

{'Id': 27792929, 'IndicatorCode': 'M_Est_tob_curr', 'SpatialDimType': 'COUNTRY', 'SpatialDim': 'CPV', 'TimeDimType': 'YEAR', 'TimeDim': 2020, 'Dim1Type': 'SEX', 'Dim1': 'MLE', 'Dim2Type': None, 'Dim2': None, 'Dim3Type': None, 'Dim3': None, 'DataSourceDimType': None, 'DataSourceDim': None, 'Value': '17 [13.3-20.7]', 'NumericValue': 17.0, 'Low': 13.3, 'High': 20.7, 'Comments': None, 'Date': '2022-01-17T09:30:00.54+01:00', 'TimeDimensionValue': '2020', 'TimeDimensionBegin': '2020-01-01T00:00:00+01:00', 'TimeDimensionEnd': '2020-12-31T00:00:00+01:00'}
{'Id': 27792930, 'IndicatorCode': 'M_Est_tob_curr', 'SpatialDimType': 'COUNTRY', 'SpatialDim': 'CMR', 'TimeDimType': 'YEAR', 'TimeDim': 2020, 'Dim1Type': 'SEX', 'Dim1': 'MLE', 'Dim2Type': None, 'Dim2': None, 'Dim3Type': None, 'Dim3': None, 'DataSourceDimType': None, 'DataSourceDim': None, 'Value': '11.7 [9-14.3]', 'NumericValue': 11.7, 'Low': 9.0, 'High': 14.3, 'Comments': 'Projected from surveys completed prior to 2020.', 'Date': '2022-01-17


**Filtering indicator data by time dimension**

>> https://ghoapi.azureedge.net/api/WHOSIS_000001?$filter=Dim1 eq 'MLE' and date(TimeDimensionBegin) ge 2011-01-01 and date(TimeDimensionBegin) lt 2012-01-01

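The date range in this filter selects a single calendar year with a half-open interval: on or after 1 January, and before 1 January of the following year. A small sketch that builds the same expression for any year; `year_filter` and `combine` are hypothetical helpers, not part of the API.

```python
from datetime import date


def year_filter(year, field="TimeDimensionBegin"):
    """Build an OData clause selecting records whose date field falls within `year`."""
    start = date(year, 1, 1)
    end = date(year + 1, 1, 1)  # exclusive upper bound
    return f"date({field}) ge {start.isoformat()} and date({field}) lt {end.isoformat()}"


def combine(*clauses):
    """Join non-empty filter clauses with the OData 'and' operator."""
    return " and ".join(clause for clause in clauses if clause)


flt = combine("Dim1 eq 'MLE'", year_filter(2011))
print(flt)
# Dim1 eq 'MLE' and date(TimeDimensionBegin) ge 2011-01-01 and date(TimeDimensionBegin) lt 2012-01-01
```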
 

**Using null filter**

>> https://ghoapi.azureedge.net/api/WHOSIS_000001?$filter=Dim1 ne null

>> Or

>> https://ghoapi.azureedge.net/api/WHOSIS_000001?$filter=Dim1 eq null

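In the decoded responses shown earlier, missing dimensions arrive as `None` (for example `'Dim2': None`), so the same split can also be made on the client after download. A minimal sketch; `null_filter` and `split_by_null` are hypothetical helpers, and the sample records are trimmed from the `M_Est_tob_curr` output rather than from `WHOSIS_000001`.

```python
def null_filter(field, want_null=True):
    """Build the OData clause for 'field is null' or 'field is not null'."""
    op = "eq" if want_null else "ne"
    return f"{field} {op} null"


def split_by_null(records, field):
    """Return (records with a value, records without one) for the given field."""
    with_value = [rec for rec in records if rec.get(field) is not None]
    without_value = [rec for rec in records if rec.get(field) is None]
    return with_value, without_value


# Fields trimmed from the M_Est_tob_curr records shown earlier.
sample = [
    {'SpatialDim': 'DZA', 'Dim1': 'BTSX', 'Dim2': None},
    {'SpatialDim': 'BEN', 'Dim1': 'BTSX', 'Dim2': None},
]

print(null_filter("Dim1", want_null=False))  # Dim1 ne null
has_dim2, no_dim2 = split_by_null(sample, "Dim2")
print(len(has_dim2), len(no_dim2))           # 0 2
```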
## Retrieve and Preprocess Data: 

Use the OData API to fetch the required data based on the analysis goals. Apply data preprocessing steps like cleaning, filtering, and transforming the data as necessary to prepare it for analysis.
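As an illustration of the preprocessing step, the sketch below turns indicator records into a tidy pandas DataFrame. It is one possible approach under stated assumptions: the column selection and the new column names are choices made here for illustration, `to_tidy_frame` is a hypothetical helper, and the sample records are trimmed from the `M_Est_tob_curr` outputs earlier in this notebook.

```python
import pandas as pd

# Records trimmed from the M_Est_tob_curr outputs earlier in this notebook.
records = [
    {'SpatialDim': 'DZA', 'TimeDim': 2020, 'Dim1': 'BTSX', 'NumericValue': 21.5, 'Low': 16.2, 'High': 26.9},
    {'SpatialDim': 'BEN', 'TimeDim': 2020, 'Dim1': 'BTSX', 'NumericValue': 6.1, 'Low': 4.6, 'High': 7.6},
    {'SpatialDim': 'CPV', 'TimeDim': 2020, 'Dim1': 'MLE', 'NumericValue': 17.0, 'Low': 13.3, 'High': 20.7},
    {'SpatialDim': 'CMR', 'TimeDim': 2020, 'Dim1': 'MLE', 'NumericValue': 11.7, 'Low': 9.0, 'High': 14.3},
]

# API field name -> column name used in the analysis (names chosen for this sketch).
COLUMNS = {
    "SpatialDim": "country_code",
    "TimeDim": "year",
    "Dim1": "sex",
    "NumericValue": "value",
    "Low": "low",
    "High": "high",
}


def to_tidy_frame(records):
    """Keep the analysis columns, rename them, drop rows without a value, and sort."""
    df = pd.DataFrame.from_records(records)
    df = df[list(COLUMNS)].rename(columns=COLUMNS)
    df = df.dropna(subset=["value"])
    return df.sort_values(["sex", "country_code"]).reset_index(drop=True)


tidy = to_tidy_frame(records)
print(tidy)
```

The country codes in `country_code` can later be joined to the COUNTRY dimension values to obtain display names and regions.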

Defining the dashboard topic and information goals:

- Top 10 global causes of death in 2019, overall and by gender
- Tobacco consumption: https://www.who.int/data/gho/data/themes/topics/indicator-groups/indicator-group-details/GHO/tobacco-consumption
- MPOWER group P (Protect from tobacco smoke): https://www.who.int/data/gho/indicator-metadata-registry/imr-details/341
- Daily smoking of any tobacco product (age-standardized rate): https://www.who.int/data/gho/indicator-metadata-registry/imr-details/347

- Annual tobacco tax revenues - total excise: https://www.who.int/data/gho/indicator-metadata-registry/imr-details/4604
- Annual tobacco tax revenues - total: https://www.who.int/data/gho/indicator-metadata-registry/imr-details/4607
- Prevalence of current tobacco use among adolescents: https://www.who.int/data/gho/indicator-metadata-registry/imr-details/mca-prevalence-of-current-tobacco-use-among-adolescents
- Number of places smoke-free (national legislation) (Tobacco control: Protect): https://www.who.int/data/gho/data/indicators/indicator-details/GHO/gho-tobacco-control-protect-national-legislation-number-of-places-smoke-free
- Annual budget for tobacco control in US$ at official exchange rate: https://www.who.int/data/gho/indicator-metadata-registry/imr-details/1290