# SOIL & FOOD DATA - So what and what now?

## Data Sources

- **The Global Soil Dataset for Earth System Modeling** Soil Organic Carbon Density dataset at 5 minute resolution
    - Land-Atmosphere Interaction Research Group at Sun Yat-sen University
        - http://globalchange.bnu.edu.cn/research/soilwd.jsp
- **FAOSTAT** Trade: Crops and livestock products | Trade: Detailed trade matrix | Production: Crops and livestock products
    - Food and Agriculture Organization of the United Nations
        - https://www.fao.org/faostat/en/#data/TCL
        - https://www.fao.org/faostat/en/#data/TM
        - https://www.fao.org/faostat/en/#data/QCL
        
## Non-geographical Plotting

I'll load the Soil Organic Carbon Density (SOCD) dataset I prepared earlier, then load the food production and trade datasets to work alongside it.

In [None]:
# # view plots inside the notebook
# %matplotlib inline  
# import package dependencies for environment
# import numpy as np
import pandas as pd
import geopandas as gpd
# import matplotlib.pyplot as plt
# import plotly.offline as pyo
# # Set notebook mode to work in offline
# pyo.init_notebook_mode()
# import plotly.io as pio
# import plotly.figure_factory as ff
# import plotly.express as px
# import plotly.graph_objects as go # or plotly.express as px

In [None]:
# load the cached variables from earlier SOCD analysis
%store -r gdf2flat
# # load the unique lists of depths from cache also
# %store -r depths

In [None]:
# # write the gdf2flat to csv file for app build with less processing steps
# # better for now save it to my hack folder until I can configure storage specific for the app deployment
# # commented out because this was superseded by export later of further processed file
# gdf2flat.to_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/gdf2flat.csv')

In [None]:
# # view the top rows in dataframe
# gdf2flat.head()

In [None]:
# # drop the extra previous 'index' column, and
# # group by depth and count group records with pandas
# # shows that there are not records for all depths at all locations; 
# # with the first depth containing the most
# gdf2flat.drop('index', axis=1).groupby('depth').count()

In [None]:
# take only the 4.5 depth records and
# reset the index and drop the extra previous index column
gdf2flatsurface = gdf2flat[gdf2flat['depth'] == 4.5].reset_index(drop=True)
# # view the top rows of result
# gdf2flatsurface.head()

In [None]:
# quick view of resulting geodataframe
# for just the surface depth to 4.5cm, whilst
# dropping unnecessary previous 'index' column
gdf2flatsurface = gdf2flatsurface.drop('index', axis=1)
# # view the result
# gdf2flatsurface

In [None]:
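# use the world dataset bundled with geopandas as the set of country polygons
# to look up each soil measurement location's country from
# note: gpd.datasets was deprecated in geopandas 0.14 and removed in 1.0;
# on newer versions, read a downloaded Natural Earth file from a local path instead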
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

In [None]:
# join my points geometry from gdf2flatsurface to the world and 
# get the countries they are residing in using a spatial join (sjoin)
result = gpd.sjoin(gdf2flatsurface, world, how='left').reset_index(drop=True)

In [None]:
# # view all joined column names now
# result.columns

In [None]:
# # view how it looks now with joined columns
# # from the looks of it I get some NaN values, but
# # for this prototype I'll continue working with it and test how it treats those on a map
# result.tail(65)

In [None]:
# # count the NaN i.e. null value rows in the joined dataframe
# result.isna().sum()

In [None]:
# # a percent calculation of the NaN i.e. null value rows out of the total rows in the joined dataframe
# 85401/2166784

In [None]:
# # grab the NaN value rows in a dataframe
# result_isna = result[result['index_right'].isna()]
# # quick view to confirm it worked as anticipated
# # result_isna.head()

In [None]:
# # plot on a map view the NaN rows to see where they appear (rows without a country match in world low res geopandas built in dataset)
# result_isna.plot()

### Dropping values off land

Mapping the surface-level (4.5cm depth) SOCD points that did not match a country in the geopandas world dataset shows that they appear to lie just off the coasts of the continents. Because they are not on land, they are more or less irrelevant to our soil story for food production, at least generally speaking. I will drop them for the purposes of this story.
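As a small illustration of the idea, here is a sketch on made-up data (the frame `result_demo` and its values are hypothetical stand-ins for the joined result). It drops only the rows with no country match by checking `index_right`, whereas a bare `dropna()` would also drop rows that are missing any other value.

```python
import pandas as pd

# Toy stand-in for the spatial-join result: 'index_right' is NaN where a
# point matched no country polygon (all values here are made up).
result_demo = pd.DataFrame({
    "lon": [10.0, -30.0, -8.0],
    "lat": [50.0, 0.0, 40.0],
    "SOCD": [3.1, 2.4, None],
    "index_right": [5.0, None, 12.0],
    "name": ["Germany", None, "Portugal"],
})

# Drop only the rows that lack a country match, keeping rows that are
# merely missing some other value (such as SOCD in the third row).
on_land = result_demo.dropna(subset=["index_right"]).reset_index(drop=True)
print(len(result_demo), "->", len(on_land))
```

Whether the narrower `subset` is preferable depends on whether rows with other missing values should stay in the story; that is a judgment call rather than something the data decides.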

In [None]:
# drop NaN i.e. null values from the result of linking gdf2flatsurface to add country from world dataset
# and drop the extra index column
gdf2flatsurfacecountry = result.dropna().reset_index(drop=True)

In [None]:
# # what's the count now, as compared to the group done earlier for this depth?
# # by counting the first column by index name
# gdf2flatsurfacecountry['lon'].count()

In [None]:
# # view column headers again with just first data row as an example of record values
# gdf2flatsurfacecountry.head(1)

In [None]:
# rename name column to country_name, in place to replace column name in same column
# the results I checked look correct, but it did return a warning (not a deprecation):
# /Users/kathrynhurchla/opt/anaconda3/envs/envsoil/lib/python3.9/site-packages/pandas/core/frame.py:5039: SettingWithCopyWarning: 
# A value is trying to be set on a copy of a slice from a DataFrame

# See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
#   return super().rename(
gdf2flatsurfacecountry.rename(columns={"name": "country_name", "pop_est": "country_pop_est", "iso_a3": "country_iso_a3", "gdp_md_est": "country_gdp_md_est"}, inplace=True)

In [None]:
# # view the data types of columns to concatenate for URL search
# gdf2flatsurfacecountry.dtypes

In [None]:
# # view the top rows of data
# gdf2flatsurfacecountry.head(1)

In [None]:
# add a column that could hold a website URL address for action steps for specific user audience:
# where 'country_name' column contains the name of the country where the SOCD soil reading was taken, and
# which also correlates to where the food is produced that is exported to audience's chosen country where they eat
# aligns with the call to action buttons on wireframe at: https://miro.com/app/board/o9J_lhkKkOA=/

# let's come back to this in next sprint, ... see below
# # separate the concatenation step to try to avoid a SettingWithCopyWarning in Pandas
# learnmoreURL = 'https://www.ecosia.org/soil%20health%20regenerative%20agriculture%20' + gdf2flatsurfacecountry['country_name']
# advocateURL = 'https://www.ecosia.org/advocate%20for%20soil%20health%20regenerative%20agriculture%20' + gdf2flatsurfacecountry['country_name']
# investURL = 'https://www.ecosia.org/invest%20in%20soil%20health%20regenerative%20agriculture%20' + gdf2flatsurfacecountry['country_name']

# ...and use empty string for now to hold the column place
learnmoreURL = ''
advocateURL = ''
investURL = ''
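For the next sprint, one possible way to build those links is to percent-encode the search phrase with the standard library rather than typing `%20` by hand. This is only a sketch: `build_url` and `BASE` are hypothetical names, and the URL prefix is copied from the commented-out draft above, so whether the search site accepts this exact URL shape is an assumption.

```python
from urllib.parse import quote

# Prefix copied from the commented-out draft; its correctness as a search
# URL is an assumption, not something verified here.
BASE = "https://www.ecosia.org/"


def build_url(phrase: str, country: str) -> str:
    """Percent-encode a search phrase plus country name and append it to BASE."""
    return BASE + quote(f"{phrase} {country}")


# Spaces become %20; accented characters and apostrophes are escaped too.
learn_more = build_url("soil health regenerative agriculture", "Côte d'Ivoire")
print(learn_more)
```

Applied to the dataframe, something like `gdf2flatsurfacecountry['country_name'].map(lambda c: build_url('soil health regenerative agriculture', c))` would produce one URL per row without an explicit loop.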

In [None]:
# now separately assign the new URL variables to new columns appended to geopandas dataframe
# test... do I need to loop over this?
gdf2flatsurfacecountry['learnmoreURL'] = learnmoreURL
gdf2flatsurfacecountry['advocateURL'] = advocateURL
gdf2flatsurfacecountry['investURL'] = investURL

In [None]:
# # view the end data rows of the result
# gdf2flatsurfacecountry.tail(1)

In [None]:
# write a CSV of only the 4.5 depth should the app be too slow or to start with
# commented out in lieu of exporting for app the merged file with food trade links later in notebook
gdf2flatsurfacecountry.to_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/gdf2flatsurface.csv')

In [None]:
# # groupby depth with plotly.io based on example here: https://plotly.com/python/group-by/
# # as a test, I'm not really clear what this is showing,
# # or if I need to iterate over the records still, e.g. to show a mean
# # depths contains array([  4.5       ,   9.10000038,  16.60000038,  28.89999962,
# #         49.29999924,  82.90000153, 138.30000305, 229.6000061 ])

# depth = depths
# SOCD = gdf2flat['SOCD']

# data = [dict(
#   type = 'scatter',
#   x = depth,
#   y = SOCD,
#   mode = 'markers',
#   markersize = 5,
#   transforms = [dict(
#     type = 'groupby',
#     groups = depths,
#     styles = [
#         dict(target =    4.5       , value = dict(marker = dict(color = 'Set1[1]'))),
#         dict(target =    9.10000038, value = dict(marker = dict(color = 'Set1[2]'))),
#         dict(target =   16.60000038, value = dict(marker = dict(color = 'Set1[3]'))),
#         dict(target =   28.89999962, value = dict(marker = dict(color = 'Set1[4]'))),
#         dict(target =   49.29999924, value = dict(marker = dict(color = 'Set1[5]'))),
#         dict(target =   82.90000153, value = dict(marker = dict(color = 'Set1[6]'))),
#         dict(target =  138.30000305, value = dict(marker = dict(color = 'Set1[7]'))),
#         dict(target =  229.6000061 , value = dict(marker = dict(color = 'Set1[8]'))),
#     ]
#   )]
# )]

# fig_dict = dict(data=data)
# pio.show(fig_dict, validate=False)

In [None]:
# # check my working directory
# !pwd

In [None]:
# # look for the file path of the trade file
# !ls ../data

In [None]:
# # check for the file name of trade file
# !ls ../data/Trade_CropsLivestock_E_All_Data_(Normalized)/Trade_Crops_Livestock_E_All_Data_(Normalized).csv

In [None]:
# # load in the food trade data from git repository origin directory
# dftrade = pd.read_csv('../data/Trade_CropsLivestock_E_All_Data_(Normalized)/Trade_Crops_Livestock_E_All_Data_(Normalized).csv')

In [None]:
# # view top of the dataframe
# dftrade.head()
# # unfortunately my git lfs large file storage was shut down and is no longer showing file
# # until I can correct that or take the repository off line, I will try to load the file from elsewhere

In [None]:
# # view the filenames I have in my temporary storage directory for large data files
# # since I exceeded the free amount of GitHub large file storage 
# !ls /Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects

In [None]:
# # load in the food trade data copy freshly downloaded from an alternate directory
# # adding , encoding = "ISO-8859-1" to resolve "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 158927: invalid continuation byte"
# # alternately use the alias 'latin' for encoding
# dftrade = pd.read_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/Trade_Crops_Livestock_E_All_Data_(Normalized).csv', encoding = "ISO-8859-1")

In [None]:
# # view a sample top/bottom of the dataframe
# dftrade

In [None]:
# # view the column variables
# dftrade.columns

In [None]:
# but ideally what I want is to see which country exports to which country, in pairs in a record
# load in the food trade detailed matrix copy freshly downloaded from https://www.fao.org/faostat/en/#data/TM to an alternate directory
# adding , encoding = "ISO-8859-1" to resolve "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf4 in position 38698: invalid continuation byte"
# alternately use the alias 'latin' for encoding
dftrade_mx = pd.read_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/Trade_DetailedTradeMatrix_E_All_Data_(Normalized).csv', encoding = "ISO-8859-1")
# dftrade_mx
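To see why the encoding argument matters, here is a minimal standard-library sketch. It assumes the offending byte belongs to an accented character in a name; "Côte d'Ivoire" is just a plausible guess at the kind of value involved, not something confirmed from the file.

```python
# Encode a name the way an ISO-8859-1 (Latin-1) file would store it;
# the "ô" becomes the single byte 0xf4.
raw = "Côte d'Ivoire".encode("latin-1")

# Reading those bytes as UTF-8 fails, because 0xf4 announces a multi-byte
# sequence that the following byte does not continue.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the bytes were written in recovers the text.
decoded = raw.decode("ISO-8859-1")
print(utf8_ok, decoded)
```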

In [None]:
# # view a unique list of the element codes/elements
# # The input to this function needs to be one-dimensional, so multiple columns will need to be combined.
# # select the values and then view them in a flattened numpy array
# pd.unique(dftrade_mx[['Element Code','Element']].values.ravel('K'))

In [None]:
# # find the value of Element Code for Export elements
# print(str('Export Quantity = Element Code: '))
# print(dftrade_mx.loc[dftrade_mx['Element'] == 'Export Quantity', 'Element Code'].iloc[0])

In [None]:
# # view the unique combination of area and area codes
# # where 'Area Code' in table is referred to as Country Code (and/or Country Group Code for 5100+) in the Definitions and standards 
# # on FAO website at https://www.fao.org/faostat/en/#data/QCL
# # see the last records which are groupings of countries
# # note FAO provides downloadable key file of this Country Code with ISO2, ISO3, and M49 codes for each country
# # if I need it for any linkage
# dftrade_mx.groupby(['Reporter Country Code','Reporter Countries']).size()

In [None]:
# filter for just the 'Export Quantity' rows by its element code identified earlier
dftrade_mx_xq = dftrade_mx[dftrade_mx['Element Code'] == 5910].reset_index(drop=True)
# # view first data row of result
# dftrade_mx_xq.head(1)

In [None]:
# drop columns I do not need
dftrade_mx_xq = dftrade_mx_xq.drop('Element Code', axis=1)
dftrade_mx_xq = dftrade_mx_xq.drop('Year Code', axis=1)

In [None]:
# # view result top data row
# dftrade_mx_xq.head(1)

In [None]:
# # view the contents of Reporter Country in trade data
# dftrade_mx_xq['Reporter Countries'].unique()

In [None]:
# # use a boolean to check whether the available reporter and partner countries are the same
# dftrade_mx_xq['Reporter Countries'].unique() == dftrade_mx_xq['Partner Countries'].unique()

In [None]:
# # I'd like to see if I can limit the output by year due to the large size of this file, and
# # because the most recent year is most valuable to the map portion of the web app at least
# # first define groups
# groups = dftrade_mx_xq.groupby(['Partner Countries', 'Year'])

In [None]:
# # now use the describe method for summary stats based on the group filter
# for key, group in groups:
#     print(key)
#     print(group.describe())

In [None]:
# # view the most recent year for each partner country
# # group by partner countries, sorted alpha asc, then by year and calculate the max year
# maxYear = dftrade_mx_xq.groupby(['Partner Countries'], sort=True).agg(max_Year=('Year', 'max'))
# # include a string title
# print('Most Recent Year Trade by Partner Countries')
# # showing all rows of output with no limit for this statement only
# with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
#     # print grouped maxYear defined above
#     print(maxYear)

In [None]:
# commented out due to IndexingError: Unalignable boolean Series provided as indexer 
# (index of the boolean Series and of the indexed object do not match).
# # mask which countries have 2019 as max year
# mask = maxYear['max_Year'] == 2019
# # show the dataframe excluding the mask rows, 
# # i.e. only rows with a different max year
# dftrade_mx_xq[~mask]
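A likely cause of that IndexingError is that the mask is indexed by country (one entry per group) while the trade dataframe is indexed by row. One way around it, sketched here on made-up data with hypothetical names, is to map each row's partner country to that country's most recent year so the mask lines up row by row.

```python
import pandas as pd

# Toy trade rows (made up): A and C traded in 2019, B last traded in 2017.
trade_demo = pd.DataFrame({
    "Partner Countries": ["A", "A", "B", "C"],
    "Year": [2018, 2019, 2017, 2019],
})

# One value per country, indexed by country name.
max_year = trade_demo.groupby("Partner Countries")["Year"].max()

# Map each row's partner to that partner's most recent year, so the
# boolean mask has the same index as trade_demo.
row_max_year = trade_demo["Partner Countries"].map(max_year)
not_2019 = trade_demo[row_max_year != 2019]
print(not_2019)
```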

In [None]:
# # try another quick and dirty way to see years by just reversing sort
# # include a string title
# print('Partner Countries with Most Recent Year Trade Not in 2019')
# # print grouped maxYear defined above sorted by year asc
# # taking head i.e. top rows until I see 2019
# print(maxYear.sort_values(['max_Year', 'Partner Countries'], ascending=True).head(23))

In [None]:
# # of the countries not receiving exports in 2019, 
# # did any of them record exporting food in 2019?
# # again for quick and dirty replace the country group field
# # group by reporter countries this time, sorted alpha asc, then by year and calculate the max year
# maxExportYear = dftrade_mx_xq.groupby(['Reporter Countries'], sort=True).agg(max_ExportYear=('Year', 'max'))
# # include a string title
# print('Reporter Countries with Most Recent Year Trade Not in 2019')
# # print grouped maxExportYear defined above sorted by year asc
# # taking head i.e. top rows until I see 2019
# print(maxExportYear.sort_values(['max_ExportYear', 'Reporter Countries'], ascending=True).head(11))

In [None]:
# for my web app I will remove the export quantity trade rows where year is not 2019,
# i.e. I will keep only the most recent export dataset available
# naming it to a new dataframe whilst resetting index and dropping the previous index
dftrade_mx_xq2019 = dftrade_mx_xq[dftrade_mx_xq['Year'] == 2019].reset_index(drop=True)

In [None]:
# # validate my work by viewing the top few rows of reindexed new dataframe
# dftrade_mx_xq2019.head(3)

### Merge soil carbon data with food trade data

The app's audience member will select a partner country from the trade matrix data, so I want the trade data to be my left table. The soil organic carbon content measurements are only relevant when they relate to the audience, i.e. I only need to keep a measurement if it was taken in a country that exports food to the country the audience selects as their location.
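The effect of putting trade on the left can be sketched with two tiny made-up tables (`trade_demo` and `soil_demo` are hypothetical; the column names mirror the notebook's). Soil rows for countries that never appear as an exporter fall away, and `indicator=True` flags trade rows that found no soil match.

```python
import pandas as pd

# Toy trade rows (left table): who exports what to the audience's country.
trade_demo = pd.DataFrame({
    "Reporter Countries": ["Kenya", "Peru", "Atlantis"],
    "Partner Countries": ["France", "France", "France"],
    "Item": ["Tea", "Avocados", "Kelp"],
})

# Toy soil points (right table): several measurements per country.
soil_demo = pd.DataFrame({
    "country_name": ["Kenya", "Kenya", "Peru", "Chad"],
    "SOCD": [2.0, 3.0, 4.0, 1.0],
})

merged = pd.merge(
    trade_demo,
    soil_demo,
    left_on="Reporter Countries",
    right_on="country_name",
    how="left",
    indicator=True,  # adds a '_merge' column: 'both' or 'left_only'
)
print(merged[["Reporter Countries", "Item", "SOCD", "_merge"]])
```

Note that each trade row is repeated once per matching soil point (Kenya appears twice here), which is one reason a merge like this can grow very large on the full data.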

In [None]:
# # review the column variable names in my socd data to find link variable
# gdf2flatsurfacecountry.columns

In [None]:
# # view the contents of country_name
# gdf2flatsurfacecountry['country_name'].unique()

*At a glance, I expect I could have some that don't match up by name.*

Since I'm working today without internet, I'll continue and test which rows return null matches, for the purposes of this dataset only, and resolve from there. It would be best to link on an ISO (International Organization for Standardization) two- or three-letter country code, but without the ability to download the FAO key table today, I only have that code in one of the two source datasets I want to link.
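A quick way to test for name mismatches before merging might look like the following sketch. The two lists are made-up examples, not the real contents of either dataset, apart from "China, mainland", which the trade matrix does use.

```python
import pandas as pd

# Hypothetical samples of the names on each side of the join.
trade_names = pd.Series(["China, mainland", "United States of America", "Kenya"])
soil_names = pd.Series(["China", "United States of America", "Kenya"])

# Names used in the trade data that have no exact match in the soil data.
unmatched = trade_names[~trade_names.isin(soil_names)]
print(unmatched.tolist())
```

On the real data, the same check could be run on `dftrade_mx_xq['Reporter Countries'].unique()` against `gdf2flatsurfacecountry['country_name'].unique()`.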

### Merge soil data with only 2019 trade.

I've commented out the code cells that merge all years of trade. Instead, I merge with the new dataframe that holds only the rows for 2019, the most recent year of trade recorded.
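If the most recent year should not be hard-coded, a variation is to look it up from the data first. This is a sketch on made-up rows with hypothetical names, not the filter used in the notebook.

```python
import pandas as pd

# Toy export rows spanning several years (values are made up).
trade_demo = pd.DataFrame({
    "Reporter Countries": ["Kenya", "Kenya", "Peru", "Chile"],
    "Year": [2017, 2019, 2018, 2019],
    "Value": [10.0, 12.0, 7.0, 9.0],
})

# Find the latest year present, then keep only its rows with a fresh index.
latest_year = trade_demo["Year"].max()
trade_latest = trade_demo[trade_demo["Year"] == latest_year].reset_index(drop=True)
print(latest_year, len(trade_latest))
```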


In [None]:
# # using pandas merge function, link the trade matrix and socd dataframes
# # with left data as food trade
# dftrade_mx_xq_socdsurface = pd.merge(left=dftrade_mx_xq, right=gdf2flatsurfacecountry,
#                                      left_on='Reporter Countries',
#                                      right_on='country_name',
#                                      how='left')

In [None]:
# # write a CSV of only the 4.5 depth socd merged with food trade data
# dftrade_mx_xq_socdsurface.to_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/dftrade_mx_xq_socdsurface.csv')

To try to speed up the merge, I'll pull in a key from FAOSTAT that connects its country code with the ISO_3 code. This will allow me to merge on ISO_3 instead of the longer country name strings, because ISO_3 is already available in the soil data table: it came from the geopandas world dataset when I pulled in the country name from that standardized source.
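The key lookup can be sketched on toy data as follows. The frames `trade_demo` and `key_demo` are hypothetical, and the codes in them should be treated as illustrative values (only code 41 for "China, mainland" comes from this notebook). `validate="many_to_one"` is an optional safeguard that raises an error if the key table lists a country code more than once.

```python
import pandas as pd

# Toy trade rows carrying the FAOSTAT reporter country code.
trade_demo = pd.DataFrame({
    "Reporter Country Code": [41, 114, 170],
    "Reporter Countries": ["China, mainland", "Kenya", "Peru"],
    "Value": [10.0, 5.0, 7.0],
})

# Toy key table; it deliberately lacks code 41 to mimic a gap in the key.
key_demo = pd.DataFrame({
    "Country Code": [114, 170],
    "ISO3 Code": ["KEN", "PER"],
})

with_iso = pd.merge(
    trade_demo,
    key_demo,
    left_on="Reporter Country Code",
    right_on="Country Code",
    how="left",
    validate="many_to_one",  # fail loudly on duplicate key rows
).drop(columns="Country Code")

# Rows that found no ISO3 code in the key.
missing = with_iso[with_iso["ISO3 Code"].isna()]

# Fill the known gap by hand, as done for China, mainland in this notebook.
with_iso.loc[with_iso["Reporter Country Code"] == 41, "ISO3 Code"] = "CHN"
print(with_iso)
```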

In [None]:
# read in the FAOSTAT key dataset as a variable
faoSTATkey = pd.read_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/FAOSTAT_data_11-26-2021.csv')
# # view the top rows
# faoSTATkey.head(3)

In [None]:
# For dftrade_mx_xq2019:
# using pandas merge function, link the trade matrix reporter country code with key to append its ISO_3 code
# with left data as food trade matrix
dftrade_mx_xq2019ISO3 = pd.merge(left=dftrade_mx_xq2019, right=faoSTATkey[['Country Code','ISO3 Code']],
                                     # key column from left dataframe
                                     left_on='Reporter Country Code',
                                     # key column from right dataframe
                                     right_on='Country Code',
                                     # merge as a 'left' join type, and 
                                     # drop the duplicate key column used for join from right dataframe
                                     # (axis must be passed by keyword in pandas 2.0 and later)
                                     how='left').drop('Country Code', axis=1)

In [None]:
# # are there NaN values in ISO3?
# # grab the NaN value rows in a dataframe
# dftrade_mx_xq2019ISO3_isna = dftrade_mx_xq2019ISO3[dftrade_mx_xq2019ISO3['ISO3 Code'].isna()]
# # # quick view to confirm it worked as anticipated
# # dftrade_mx_xq2019ISO3_isna.head()
# # view unique list of Reporter Countries with NaN ISO3
# dftrade_mx_xq2019ISO3_isna.groupby(['Reporter Country Code','Reporter Countries']).size()

In [None]:
# # view unique Reporter Countries with ISO3 Code
# reporterCountries = dftrade_mx_xq2019ISO3.groupby(['Reporter Countries','ISO3 Code']).size()
# # showing all rows of output with no limit for this statement only
# with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
#     # print grouped variable defined above
#     print(reporterCountries)

In [None]:
# There is no "China" in the trade matrix. I will fill China, mainland with China's ISO3 code 'CHN'
# find rows with the Reporter Country Code 41 (for China, mainland), 
# locate the 'ISO3 Code' column in those rows and set it to 'CHN'
dftrade_mx_xq2019ISO3.loc[dftrade_mx_xq2019ISO3['Reporter Country Code'] == 41, 'ISO3 Code'] = 'CHN'

In [None]:
# rename appended ISO3 column to clarify that it's for Reporter Country in trade matrix
dftrade_mx_xq2019ISO3.rename(columns={"ISO3 Code": "Reporter Country ISO3"}, inplace=True)
# # check the result
# dftrade_mx_xq2019ISO3.head(3)

In [None]:
# view number of rows and columns
dftrade_mx_xq2019ISO3.shape

In [None]:
# write a CSV of only the 2019 export quantity food trade data with ISO3 code appended for Reporter Countries
dftrade_mx_xq2019ISO3.to_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/dftrade_mx_xq2019ISO3.csv')

In [None]:
# # I'm having trouble merging these 472,753 rows so I will try to remove some unnecessary columns
# dftrade_mx_xq2019ISO3 = dftrade_mx_xq2019ISO3.drop('Element', axis=1)
# dftrade_mx_xq2019ISO3 = dftrade_mx_xq2019ISO3.drop('Year', axis=1)
# dftrade_mx_xq2019ISO3 = dftrade_mx_xq2019ISO3.drop('Reporter Country Code', axis=1)


In [None]:
# dftrade_mx_xq2019ISO3.head(3)

In [None]:
# # are there NaN values in ISO3 now, which was renamed?
# # grab the NaN value rows in a dataframe
# dftrade_mx_xq2019ISO3_isna = dftrade_mx_xq2019ISO3[dftrade_mx_xq2019ISO3['Reporter Country ISO3'].isna()]
# # # quick view to confirm it worked as anticipated
# # dftrade_mx_xq2019ISO3_isna.head()
# # view unique list of Reporter Countries with NaN ISO3
# dftrade_mx_xq2019ISO3_isna.groupby(['Reporter Country Code','Reporter Countries']).size()

In [None]:
# # view unique Reporter Countries with ISO3 Code again now, and we should see China, mainland with CHN included
# reporterCountries = dftrade_mx_xq2019ISO3.groupby(['Reporter Countries','Reporter Country ISO3']).size()
# # showing all rows of output with no limit for this statement only
# with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
#     # print grouped variable defined above
#     print(reporterCountries)

In [None]:
# # Commented out because this is hanging up in the Notebook and may be too large
# # I will export the processed file before and merge in a script
# # For dftrade_mx_xq2019:
# # using pandas merge function, link the trade matrix and socd dataframes
# # with left data as food trade
# # using as link 'Reporter Country ISO3' and 'country_iso_a3'
# # instead of the slow to merge 'Reporter Countries' and 'country_name'
# dftrade_mx_xq2019_socdsurface = pd.merge(left=dftrade_mx_xq2019ISO3, right=gdf2flatsurfacecountry,
#                                      left_on='Reporter Country ISO3',
#                                      right_on='country_iso_a3', 
#                                      # merge as a 'left' join type, and 
#                                      # drop the duplicate key column used for join from right dataframe
#                                      how='left').drop('country_iso_a3', axis=1)

In [None]:
# # view the number of rows and columns of result (shape is an attribute, not a method)
# dftrade_mx_xq2019_socdsurface.shape
# # view the top data rows of the result
# dftrade_mx_xq2019_socdsurface.head()

In [None]:
# # For dftrade_mx_xq2019_socdsurface:
# # write a CSV of only the 4.5 depth socd merged with food trade data
# dftrade_mx_xq2019_socdsurface.to_csv('/Users/kathrynhurchla/Documents/hack_mylfs_GitHub_projects/dftrade_mx_xq2019_socdsurface.csv')

### Plot food export partners matrix

Now that I have a dataset showing where food comes from and where it's exported to, I'll see if I can show this visually.
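Before drawing anything, it may help to collapse the trade rows into one record per exporter and importer pair, so each pair becomes a single line on the map. This is a sketch on made-up rows with hypothetical names; it assumes all values share one unit, which would need checking against the `Unit` column in the real data.

```python
import pandas as pd

# Toy export rows (values are made up); column names mirror the trade matrix.
trade_demo = pd.DataFrame({
    "Reporter Countries": ["Kenya", "Kenya", "Peru"],
    "Partner Countries": ["France", "France", "Spain"],
    "Item": ["Tea", "Coffee", "Avocados"],
    "Value": [100.0, 50.0, 80.0],
})

# One row per (exporter, importer) pair: total quantity and item count.
flows = (
    trade_demo
    .groupby(["Reporter Countries", "Partner Countries"], as_index=False)
    .agg(total_value=("Value", "sum"), items=("Item", "nunique"))
)
print(flows)
```

Each row of `flows` could then feed one line trace, with `total_value` or `items` used for hover text or line width.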

In [None]:
# using Plotly Graph Objects (go), plot lines on a map
# based on an example at https://plotly.com/python/lines-on-maps/
# world scope with locations by country names (collect an ISO-3 if names don't work well, i.e. gaps)
# dftrade_mx_xq for paths
# see for projection_type options: https://plotly.com/python/reference/layout/geo/#layout-geo-projection-type

# fig = go.Figure()

# fig.add_trace(go.Scattergeo(
#     locationmode = 'country names',
#     locations = dftrade_mx_xq['Reporter Countries'],
#     hoverinfo = 'text',
# #     # string concatenation in pandas for hover text
# #     # also a <br> within quotes can put that data on a new line in the hover text optionally
# #     text = dftrade_mx_xq['Reporter Countries'].astype(str) + " exported " +  dftrade_mx_xq["Value"].astype(str) + " " + dftrade_mx_xq["Unit"].astype(str) + " of " + dftrade_mx_xq["Item"].astype(str) + " to " + dftrade_mx_xq["Partner Countries"].astype(str) + " in " + dftrade_mx_xq["Year"].astype(str),
#     text = dftrade_mx_xq["Item"],
#     mode = 'markers',
#     marker = dict(
#         size = 2,
#         color = 'rgb(255, 0, 0)',
#         line = dict(
#             width = 3,
#             color = 'rgba(68, 68, 68, 0)'
#         )
#     )))

# fig.add_trace(
#     go.Scattergeo(
#         locationmode = 'country names',
# #         hoverinfo = 'text',
# #         text = dftrade_mx_xq['Item'],
#         mode = 'lines',
#         line = dict(width = 1,color = 'red'),
#         opacity = 0.5
#     )
# )

# fig.update_layout(
#     title_text = 'Food Trade<br>(Hover for item exported)',
#     showlegend = False,
#     geo = go.layout.Geo(
#         scope = 'world',
#         projection_type = 'winkel tripel',
#         showland = True,
#         landcolor = 'rgb(243, 243, 243)',
#         countrycolor = 'rgb(204, 204, 204)',
#     ),
#     height=700,
# )

# fig.show()

In [None]:
# # try with gdf2flatsurface which I have lat long values for
# fig = go.Figure()

# fig.add_trace(go.Scattergeo(
#     locationmode = 'country names',
#     locations = dftrade_mx_xq['Reporter Countries'],
#     hoverinfo = 'text',
# #     # string concatenation in pandas for hover text
# #     # also a <br> within quotes can put that data on a new line in the hover text optionally
# #     text = dftrade_mx_xq['Reporter Countries'].astype(str) + " exported " +  dftrade_mx_xq["Value"].astype(str) + " " + dftrade_mx_xq["Unit"].astype(str) + " of " + dftrade_mx_xq["Item"].astype(str) + " to " + dftrade_mx_xq["Partner Countries"].astype(str) + " in " + dftrade_mx_xq["Year"].astype(str),
#     text = dftrade_mx_xq["Item"],
#     mode = 'markers',
#     marker = dict(
#         size = 2,
#         color = 'rgb(255, 0, 0)',
#         line = dict(
#             width = 3,
#             color = 'rgba(68, 68, 68, 0)'
#         )
#     )))

# fig.add_trace(
#     go.Scattergeo(
#         locationmode = 'country names',
# #         hoverinfo = 'text',
# #         text = dftrade_mx_xq['Item'],
#         mode = 'lines',
#         line = dict(width = 1,color = 'red'),
#         opacity = 0.5
#     )
# )

# fig.update_layout(
#     title_text = 'Food Trade<br>(Hover for item exported)',
#     showlegend = False,
#     geo = go.layout.Geo(
#         scope = 'world',
#         projection_type = 'winkel tripel',
#         showland = True,
#         landcolor = 'rgb(243, 243, 243)',
#         countrycolor = 'rgb(204, 204, 204)',
#     ),
#     height=700,
# )

# fig.show()

In [None]:
# # run through a standalone (within this single cell) test with Dash 
# # for a web app to build outside of jupyter notebook
# import plotly.graph_objects as go # or plotly.express as px
# fig = go.Figure() # or any Plotly Express function e.g. px.bar(...)
# fig.add_trace( ... )
# fig.update_layout( ... )

# import dash
# # in Dash 2.0 and later, dcc and html ship with the dash package itself
# from dash import dcc, html

# app = dash.Dash()
# app.layout = html.Div([
#     dcc.Graph(figure=fig)
# ])

# app.run_server(debug=True, use_reloader=False)  # Turn off reloader if inside Jupyter