### Visualization Project Part 1: Finding your Data
---
Locate a dataset that you are interested in working with. The data should be sufficiently complex that you can ask lots of questions about it and engage in creative design techniques, but not so complex that you need specialized hardware or algorithmic approaches to analyze. While you are welcome to use any data you’d like, I recommend that your dataset be tabular (e.g., CSV, TSV, SQL, etc.), contain 5,000 or fewer datapoints (on the order of one hundred or so tends to be sufficiently interesting without causing lag in Altair), and be data that you’re comfortable discussing as part of the course (e.g., avoid data that is overly private or classified).

Discuss your dataset, including the data’s source, key attributes/dimensions of the data, and your goals for working with that data (i.e., what are the key questions you want to answer). Identify existing relevant visualizations for working with that data (either using the same data, showing the same concepts, or just that might provide some inspiration) and critique those visualizations based on the practices from this module. What works well? What might need to be improved or changed to answer your target questions?

### Part 1 Answers:

##### Dataset details:
The dataset is from the Union of Concerned Scientists (UCSUSA) and details all openly known satellites orbiting the Earth as of its most recent update, January 1, 2023.

- **Source:** *Accessed 12/1/2023* | UCS-Satellite-Database-Officialname-1-1-2023.csv | https://www.ucsusa.org/resources/satellite-database
- 6718 satellites listed.
- 68 features (columns) for each satellite, including name, country of owner, purpose, orbital information, launch information, etc.

##### Preliminary Goals for this Data:

- Determine which country has the most satellites in orbit currently and execute a method that allows users to reach increased depths of information through this visualization, likely through grouping/aggregating features.
- Create a compelling way to identify orbital information about satellites. Look into different orbit features which may impact lifespan and/or see which orbiting altitudes are most congested.
- Feature engineer interesting statistics about the age of the satellites using life expectancy and launch date information.

##### Existing Visualizations:
UCSUSA has a world map visualization that highlights which countries have satellites and which do not. It also distinguishes the countries that launch satellites from those that do not, and a slider switches the map between 1966 and 2020. You can see it at the website above.

**Pros:**

- This visualization conveys at a glance which countries possess satellites, making it especially easy to find the answer for a specific country.
- The color orange in juxtaposition to the grey map makes the information easily identifiable.
    - Also, the textured orange keeps all the positives previously mentioned while still being easily distinguishable from the plain orange.

**Cons:**

- Country names are quite small, which may not be ideal for users unfamiliar with geography.
- The slider used to switch between 1966 and 2020, while intuitive, seems frivolous, especially since its purpose is never explained.
- A tooltip that shows more data when hovering over each country would be a nice addition, either instead of or in conjunction with the slider.

Overall, I believe it is a successful visualization, but its uses are limited because it doesn't answer the more interesting questions that live within this dataset. Giving the user the ability to reach greater information depth through other methods of interaction would improve this execution.

In [53]:
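# Import necessary packages and data.
import numpy as np
import pandas as pd
import altair as alt
from vega_datasets import data

# Use the VegaFusion data transformer so charts built from the full table
# are not blocked by Altair's default 5,000-row limit.
alt.data_transformers.enable('vegafusion')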

sat_df = pd.read_csv('UCS-Satellite-Database-1-1-2023.csv')
country_codes_df = pd.read_csv('iso_3166_country_codes.csv')

In [54]:
# Preliminary EDA
print(sat_df.shape)
print(sat_df.columns)

(6718, 68)
Index(['Name of Satellite, Alternate Names',
       'Current Official Name of Satellite', 'Country/Org of UN Registry',
       'Country of Operator/Owner', 'Operator/Owner', 'Users', 'Purpose',
       'Detailed Purpose', 'Class of Orbit', 'Type of Orbit',
       'Longitude of GEO (degrees)', 'Perigee (km)', 'Apogee (km)',
       'Eccentricity', 'Inclination (degrees)', 'Period (minutes)',
       'Launch Mass (kg.)', ' Dry Mass (kg.) ', 'Power (watts)',
       'Date of Launch', 'Expected Lifetime (yrs.)', 'Contractor',
       'Country of Contractor', 'Launch Site', 'Launch Vehicle',
       'COSPAR Number', 'NORAD Number', 'Comments', 'Unnamed: 28',
       'Source Used for Orbital Data', 'Source', 'Source.1', 'Source.2',
       'Source.3', 'Source.4', 'Source.5', 'Source.6', 'Unnamed: 37',
       'Unnamed: 38', 'Unnamed: 39', 'Unnamed: 40', 'Unnamed: 41',
       'Unnamed: 42', 'Unnamed: 43', 'Unnamed: 44', 'Unnamed: 45',
       'Unnamed: 46', 'Unnamed: 47', 'Unnamed: 48', 'Unn

In [55]:
# Display the operators/owners with the most listed satellites, along with their countries.
sat_df.groupby(['Country of Operator/Owner', 'Operator/Owner']).size().sort_values(ascending = False).head(20).reset_index()

Unnamed: 0,Country of Operator/Owner,Operator/Owner,0
0,USA,SpaceX,3349
1,United Kingdom,OneWeb Satellites,502
2,USA,"Planet Labs, Inc.",195
3,China,Chinese Ministry of National Defense,147
4,USA,Spire Global Inc.,127
5,Russia,Ministry of Defense,99
6,USA,Swarm Technologies,84
7,USA,"Iridium Communications, Inc.",75
8,China,Chang Guang Satellite Technology Co. Ltd.,53
9,USA,National Reconnaissance Office (NRO),50


In [56]:
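# Keep only the first 28 columns (this drops the trailing 'Source' and empty 'Unnamed' columns),
# then check each remaining column for its number of unique values (NaN counts as a value).
sat_df = sat_df.iloc[:, :28].copy()
for col in sat_df:
    print(col, '|', sat_df[col].unique().size)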

Name of Satellite, Alternate Names | 6709
Current Official Name of Satellite | 6698
Country/Org of UN Registry | 70
Country of Operator/Owner | 104
Operator/Owner | 639
Users | 20
Purpose | 31
Detailed Purpose | 53
Class of Orbit | 5
Type of Orbit | 9
Longitude of GEO (degrees) | 446
Perigee (km) | 783
Apogee (km) | 777
Eccentricity | 796
Inclination (degrees) | 450
Period (minutes) | 580
Launch Mass (kg.) | 567
 Dry Mass (kg.)  | 172
Power (watts) | 153
Date of Launch | 1187
Expected Lifetime (yrs.) | 29
Contractor | 560
Country of Contractor | 103
Launch Site | 39
Launch Vehicle | 164
COSPAR Number | 6707
NORAD Number | 6703
Comments | 1288


In [57]:
# Clean up the data for use.
# Rename the country-code columns to match what is used in the satellite data.
# This will help during the merge to add a country code column for use in maps.
country_codes_df = country_codes_df.rename(
    {'name' : 'Country of Operator/Owner', 'country-code' : 'Country_Code'}, axis = 1)

# Map the long-form country names to the short names used in the satellite data.
# Should've just changed the .csv at this point but here we are.
country_name_fixes = {
    'United States of America': 'USA',
    'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
    'Russian Federation': 'Russia',
    'Korea, Republic of': 'South Korea',
    'Taiwan, Province of China': 'Taiwan',
    'Iran (Islamic Republic of)': 'Iran',
    "Lao People's Democratic Republic": 'Laos',
    'Viet Nam': 'Vietnam',
    'Venezuela (Bolivarian Republic of)': 'Venezuela',
    'Bolivia (Plurinational State of)': 'Bolivia',
}
country_codes_df['Country of Operator/Owner'] = country_codes_df['Country of Operator/Owner'].replace(country_name_fixes)


# Satellite df cleaning. Mostly misspellings or duplicate categories.
# Country fixes
sat_df.loc[sat_df['Country of Operator/Owner'] == 'ESA/', 'Country of Operator/Owner'] = 'USA' # This is the Hubble Telescope!
sat_df.loc[sat_df['Country of Operator/Owner'] == 'Czech Republic', 'Country of Operator/Owner'] = 'Czechia'
sat_df.loc[sat_df['Country of Operator/Owner'] == 'China ', 'Country of Operator/Owner'] = 'China'
sat_df.loc[sat_df['Country of Operator/Owner'] == "Sinapore", 'Country of Operator/Owner'] = 'Singapore'
# Operator name fixes.
sat_df.loc[sat_df['Operator/Owner'] == "Spacex", 'Operator/Owner'] = 'SpaceX'
sat_df.loc[sat_df['Operator/Owner'] == "US Air Force ", 'Operator/Owner'] = 'US Air Force'
# Purpose category fixes.
sat_df.loc[sat_df['Purpose'] == "Earth Observation ", 'Purpose'] = 'Earth Observation'
sat_df.loc[sat_df['Purpose'] == "Earth Observation/Navigation", 'Purpose'] = 'Earth Observation'
sat_df.loc[sat_df['Purpose'] == "Communications/Navigation", 'Purpose'] = 'Communications'
sat_df.loc[sat_df['Purpose'] == "Communications/Technology Development", 'Purpose'] = 'Communications'
sat_df.loc[sat_df['Purpose'] == "Earth Observation/Communications/Space Science", 'Purpose'] = 'Earth Observation'
sat_df.loc[sat_df['Purpose'] == "Earth Observation/Space Science", 'Purpose'] = 'Earth Observation'
sat_df.loc[sat_df['Purpose'] == "Space Observation", 'Purpose'] = 'Space Science'
# Remove commas from Launch mass as they don't play well with Vega-Altair.
sat_df['Launch Mass (kg.)'] = sat_df['Launch Mass (kg.)'].str.replace(',', '').astype(float)

# Merge country info with satellite dataframe.
print(country_codes_df.head())
sat_df = sat_df.merge(country_codes_df, on = 'Country of Operator/Owner', how = 'left')
# Display and check if correct.
display(sat_df.head(2))

  Country of Operator/Owner alpha-3  Country_Code
0               Afghanistan     AFG             4
1             Åland Islands     ALA           248
2                   Albania     ALB             8
3                   Algeria     DZA            12
4            American Samoa     ASM            16


Unnamed: 0,"Name of Satellite, Alternate Names",Current Official Name of Satellite,Country/Org of UN Registry,Country of Operator/Owner,Operator/Owner,Users,Purpose,Detailed Purpose,Class of Orbit,Type of Orbit,...,Expected Lifetime (yrs.),Contractor,Country of Contractor,Launch Site,Launch Vehicle,COSPAR Number,NORAD Number,Comments,alpha-3,Country_Code
0,1HOPSAT-TD (1st-generation High Optical Perfor...,1HOPSAT-TD,NR,USA,Hera Systems,Commercial,Earth Observation,Infrared Imaging,LEO,Non-Polar Inclined,...,0.5,Hera Systems,USA,Satish Dhawan Space Centre,PSLV,2019-089H,44859,Pathfinder for planned earth observation const...,USA,840.0
1,Aalto-1,Aalto-1,Finland,Finland,Aalto University,Civil,Technology Development,,LEO,Sun-Synchronous,...,2.0,Aalto University,Finland,Satish Dhawan Space Centre,PSLV,2017-036L,42775,Technology development and education.,FIN,246.0


In [58]:
# Ensure all countries have their proper codes by seeing which entries don't have one.
print(sat_df['Country of Operator/Owner'].where(sat_df['Country_Code'].isna()).unique())

[nan 'Multinational' 'ESA' 'USA/Argentina' 'France/Italy' 'China/Brazil'
 'China/France' 'USA/Canada/Japan' 'USA/Japan/Brazil' 'USA/Japan'
 'USA/Germany' 'France/Italy/Belgium/Spain/Greece' 'Greece/United Kingdom'
 'United Kingdom/ESA' 'USA/India/Singapore/Taiwan' 'ESA/Russia'
 'USA/France' 'Japan/Singapore' 'United Kingdom/Netherlands'
 'Morocco/Germany' 'India/France' 'USA/Canada' 'India/Canada'
 'France/Belgium/Sweden' 'Singapore/Taiwan' 'Poland/UK'
 'USA/United Kingdom/Italy' 'Turkmenistan/Monaco' 'France/Israel'
 'China/Italy']
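
The check above can also be done at merge time. The following is a minimal sketch, assuming toy stand-ins for `sat_df` and `country_codes_df` (the rows are hypothetical, not taken from the real files); it uses the `indicator` option of `DataFrame.merge` to flag keys that found no match.

```python
import pandas as pd

# Toy stand-ins for sat_df and country_codes_df (hypothetical rows, not the real data).
sats = pd.DataFrame({
    'Country of Operator/Owner': ['USA', 'Finland', 'USA/Japan', 'ESA'],
})
codes = pd.DataFrame({
    'Country of Operator/Owner': ['USA', 'Finland', 'Japan'],
    'Country_Code': [840, 246, 392],
})

# indicator=True adds a '_merge' column: 'both' if the key matched, 'left_only' if not.
merged = sats.merge(codes, on='Country of Operator/Owner', how='left', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Country of Operator/Owner']

print(sorted(unmatched.unique()))
```

Compared with filtering on a missing `Country_Code`, the `_merge` column separates "no match in the lookup table" from "matched, but the code itself is missing".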


In [59]:
# Check how many satellites have no country code (co-owned, multinational, or ESA entries).
# Will leave these out of the calculation for individual countries for now.
sat_df.loc[sat_df['Country_Code'].isna(), 'Country of Operator/Owner'].size

168
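
If the co-owned satellites are brought back in later, one option is to split each joint label and credit every listed country. This is only a sketch under stated assumptions: the helper `split_owners` is hypothetical, and full credit per co-owner is one possible rule (fractional credit is another).

```python
from collections import Counter

# A few joint-ownership labels copied from the unmatched list above.
joint_labels = ['USA/Japan', 'USA/Canada/Japan', 'France/Italy', 'Multinational']

def split_owners(label):
    """Split an 'A/B/C' ownership label into individual country names."""
    return [part.strip() for part in label.split('/') if part.strip()]

# Credit every listed country with one satellite per label.
# Assumption: each co-owner gets full credit for the satellite.
credit = Counter()
for label in joint_labels:
    credit.update(split_owners(label))

print(credit)
```

Labels such as `Multinational` and `ESA` cannot be split this way, and abbreviations like `UK` in `Poland/UK` would still need to be mapped to `United Kingdom` before merging with the country codes.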

In [61]:
# Check dates datatype for use later.
# Reformat into a datetime format.
print(sat_df['Date of Launch'].dtype)
# One entry has a malformed year ('018'); patch it by hand before parsing.
sat_df.loc[sat_df['Date of Launch'] == '11/29/018', 'Date of Launch'] = '29-11-2018'
sat_df['Date of Launch'] = pd.to_datetime(sat_df['Date of Launch'], format = '%d-%m-%Y', errors = 'raise')
print(sat_df['Date of Launch'].dtype)
# Create a separate column for year of launch for easier utilization later.
sat_df['Year of Launch'] = sat_df['Date of Launch'].dt.year

# Check date change for sanity check.
print(sat_df.iloc[1070])

object
datetime64[ns]
Name of Satellite, Alternate Names    Hubble Space Telescope (HST, Space Telescope)
Current Official Name of Satellite                           Hubble Space Telescope
Country/Org of UN Registry                                                      USA
Country of Operator/Owner                                                       USA
Operator/Owner                                     European Space Agency (ESA)/NASA
Users                                                                    Government
Purpose                                                               Space Science
Detailed Purpose                                                                NaN
Class of Orbit                                                                  LEO
Type of Orbit                                                    Non-Polar Inclined
Longitude of GEO (degrees)                                                      0.0
Perigee (km)                                          
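
The cell above patches `'11/29/018'` by hand. A way to locate such entries in the first place is sketched below, assuming a small hypothetical series of launch dates in the same day-month-year layout; `errors='coerce'` is used here instead of the `errors='raise'` in the notebook.

```python
import pandas as pd

# Hypothetical launch dates in the day-month-year layout parsed above,
# plus the one malformed value that the notebook patches by hand.
dates = pd.Series(['07-12-2019', '23-06-2017', '11/29/018'])

# errors='coerce' turns anything that does not match the format into NaT
# instead of raising, so the bad rows can be listed and fixed deliberately.
parsed = pd.to_datetime(dates, format='%d-%m-%Y', errors='coerce')
bad_values = dates[parsed.isna()]

print(bad_values.tolist())
```

Once the offending values are known and corrected, switching back to `errors='raise'` keeps any new malformed dates from slipping through silently.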

### Visualizations

In [75]:
# Let's see what the data looks like organized and sorted by country and total count.
print(sat_df.groupby(['Country of Operator/Owner', 'alpha-3', 'Country_Code']).size().sort_values(ascending = False).head(20))

Country of Operator/Owner  alpha-3  Country_Code
USA                        USA      840.0           4512
China                      CHN      156.0            587
United Kingdom             GBR      826.0            561
Russia                     RUS      643.0            177
Japan                      JPN      392.0             88
India                      IND      356.0             59
Canada                     CAN      124.0             56
Germany                    DEU      276.0             48
Luxembourg                 LUX      442.0             45
Argentina                  ARG      32.0              38
Israel                     ISR      376.0             27
Spain                      ESP      724.0             26
France                     FRA      250.0             24
Finland                    FIN      246.0             23
South Korea                KOR      410.0             21
Italy                      ITA      380.0             15
Switzerland                CHE      756

In [63]:
country_select = alt.selection_point(fields = ['Country of Operator/Owner'], value = 'USA')
country_title = alt.TitleParams('Total Number of Active Satellites by Country',
                 subtitle = ['Filter Country by clicking on corresponding bar.', '(Source Data Updated 1/2023)'])

# Country totals graph.
# Added interactivity to select country and show 3 graphs.
country_bar_graph = alt.Chart(sat_df, title = country_title
    ).transform_aggregate(
        groupby = ['Country of Operator/Owner'],
        count = 'count()'
    ).transform_window(
        rank = 'rank(count)',
        sort = [alt.SortField('count', order = 'descending')]
    ).transform_filter(
        (alt.datum.rank <= 29)
    ).mark_bar().encode(
        x = alt.X('Country of Operator/Owner:N', 
            sort = '-y'),
        y = alt.Y('count:Q', title = 'Number of Satellites (Log Scale)').scale(type = 'log'),
        color = alt.Color('Country of Operator/Owner:N', legend = None).scale(scheme = 'tableau20'),
        stroke = alt.condition(country_select, alt.ColorValue('black'), alt.Color('Country of Operator/Owner:N', legend = None)),
        strokeWidth = alt.condition(country_select, alt.value(1), alt.value(0.5)),
        text = alt.Text('count:Q'),
        opacity = alt.condition(country_select, alt.value(0.7), alt.value(0.4))
    ).add_params(country_select)#.properties(width = 1000)

# Selected country's operator/owner graph.
operator_title = alt.TitleParams('Number of Satellites by Operator/Owner')
operator_bar_graph = alt.Chart(sat_df, title = operator_title
    ).mark_bar().encode(
        y = alt.Y('Operator/Owner:N',
                sort = '-x'),
        x = alt.X('count():Q'),
        text = alt.Text('count():Q'),
        color = alt.Color('count():O').scale(scheme = 'warmgreys')
    ).transform_filter(country_select)

# Selected country's purpose of satellite graph.
purpose_title = alt.TitleParams('Number of Satellites by Functionality')
purpose_bar_graph = alt.Chart(sat_df, title = purpose_title
    ).mark_bar().encode(
        x = alt.X('Purpose:N',
                sort = '-y'),
        y = alt.Y('count():Q'),
        text = alt.Text('count():Q'),
        color = alt.condition(alt.datum.count == 0, alt.value('lightblue'), 'count():O', legend = None)
    ).transform_filter(country_select)

# Selected country's launches per year.
year_title = alt.TitleParams('Number of Satellites Launched per Year')
year_bar_graph = alt.Chart(sat_df, title = year_title
    ).mark_bar().encode(
        y = alt.Y('Year of Launch:O', 
                axis = alt.Axis(orient = 'right'), scale = alt.Scale(reverse = True)),
        x = alt.X('count():Q', scale = alt.Scale(reverse = True)),
        text = alt.Text('count():Q'),
        color = alt.condition(alt.datum.count == 0, alt.value('blue'), 'count():O', legend = None)
    ).transform_filter(country_select)

country_bar_graph = country_bar_graph + country_bar_graph.mark_text(align = 'center', dy = -14, fontSize = 9)
operator_bar_graph = operator_bar_graph + operator_bar_graph.mark_text(align = 'center', dx = 14, fontSize = 9)
purpose_bar_graph = purpose_bar_graph + purpose_bar_graph.mark_text(align = 'center', dy = -14, fontSize = 9)
year_bar_graph = year_bar_graph + year_bar_graph.mark_text(align = 'center', dx = -14, fontSize = 9)

(country_bar_graph & (operator_bar_graph | purpose_bar_graph | year_bar_graph))

In [64]:
# sat_df[sat_df['Operator/Owner'].str.contains('National Aeronautics and Space')]
sat_df.loc[sat_df['Country of Operator/Owner'] == 'USA', 'Operator/Owner'].unique()

array(['Hera Systems', 'National Reconnaissance Office (NRO)',
       'US Air Force', 'Department of Homeland Security',
       'Aerospace Corporation',
       'Center for Atmospheric Sciences, Hampton University/NASA',
       'US Air Force Institute of Technology', 'SES S.A.',
       'SES S.A./Gogo', 'AMSAT-NA', 'ANDESITE - Boston University',
       'University of South Florida, Institute of Applied Engineering (IAE).',
       'Planetary Resources', '1Worldspace', 'Astranis', 'DirecTV, Inc.',
       'PointView Tech', 'Salish Kootenai College', 'BlackSky Global',
       'AST SpaceMobile', 'SpaceQuest, Ltd.',
       'Capital Technology University',
       'University of Louisiana at Lafayette', 'Capella Space',
       'Air Force Research Laboratory',
       'NASA Goddard Space Flight Center',
       'Defense Innovation Unit/Cesium Astro',
       'National Aeronautics and Space Administration (NASA) Goddard Space Flight Center',
       'University of Florida', 'GeoOptics Inc.', 'GeoOpti

In [65]:
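# Expected years of life remaining as of 2023: (launch year + expected lifetime) - 2023.
# Negative values mean the satellite has outlived its expected lifetime;
# NaN means no expected lifetime is listed.
sat_df['Age_Remaining'] = (sat_df['Year of Launch'] + sat_df['Expected Lifetime (yrs.)']) - 2023
sat_df.sort_values('Age_Remaining')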

Unnamed: 0,"Name of Satellite, Alternate Names",Current Official Name of Satellite,Country/Org of UN Registry,Country of Operator/Owner,Operator/Owner,Users,Purpose,Detailed Purpose,Class of Orbit,Type of Orbit,...,Country of Contractor,Launch Site,Launch Vehicle,COSPAR Number,NORAD Number,Comments,alpha-3,Country_Code,Year of Launch,Age_Remaining
755,FLTSATCOM-8 (USA 46),USA 46,USA,USA,US Navy,Military,Communications,,GEO,,...,USA,Cape Canaveral,Atlas Centaur,1989-077A,20253,Old system replaced by UFO satellites; this sa...,USA,840.0,1989,-29.0
2565,SCD-1 (Satélite de Coleta de Dados),SCD-1,Brazil,Brazil,Instituto Nacional de Pesquisas Espaciais (INPE),Government,Earth Observation,Meteorology/Earth Science,LEO,Non-Polar Inclined,...,Brazil,Cape Canaveral,Pegasus,1993-009B,22491,Collects meteorological and environmental data...,BRA,76.0,1993,-27.0
2683,Skynet 4C,Skynet 4C,United Kingdom,United Kingdom,Intelsat/Paradigm Secure Communications (wholl...,Military,Communications,,GEO,,...,France/UK/Germany,Guiana Space Center,Ariane 44LP,1990-079A,20776,Spare. In March 2010 it was announced that the...,GBR,826.0,1990,-26.0
6298,"TDRS-3 (Tracking and Data Relay Satellite, TDR...",TDRS-3,USA,USA,National Aeronautics and Space Administration ...,Government,Communications,,GEO,,...,USA,Cape Canaveral,Space Shuttle (STS 26),1988-091B,19548,Backup; still partially operational.,USA,840.0,1988,-25.0
6440,"UFO-4 (USA 108, UFO F4 EHF) ""UHF Follow-On""",USA 108,USA,USA,US Navy,Military,Communications,,GEO,,...,USA,Cape Canaveral,Atlas 2,1995-003A,23467,Ultra-High Frequency (UHF) communications and ...,USA,840.0,1995,-24.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6710,Zhuhai 1-06 (OHS-3),OHS-3,China,China,Zhuhai Orbita Aerospace Science and Technology...,Commercial,Earth Observation,Hyperspectral Imaging,LEO,Sun-Synchronous,...,China,Jiuquan Satellite Launch Center,Long March 11,2018-040D,43442,"Survey natural resources, cities, crops and fo...",CHN,156.0,2018,
6711,Zhuhai 1-07 (OHS-4),OHS-4,China,China,Zhuhai Orbita Aerospace Science and Technology...,Commercial,Earth Observation,Hyperspectral Imaging,LEO,Sun-Synchronous,...,China,Jiuquan Satellite Launch Center,Long March 11,2018-040E,43443,"Survey natural resources, cities, crops and fo...",CHN,156.0,2018,
6712,Ziyuan 1-02C,Ziyuan 1-02C,China,China,China Centre for Resources Satellite Data and ...,Government,Earth Observation,Optical Imaging,LEO,Sun-Synchronous,...,China,Taiyuan Launch Center,Long March 4B,2011-079A,38038,Can acquire high-resolution data through remot...,CHN,156.0,2011,
6716,Ziyuan 3-3,Ziyuan 3-3,China,China,China Centre for Resources Satellite Data and ...,Government,Earth Observation,Optical Imaging,LEO,Sun-Synchronous,...,China,Taiyuan Launch Center,Long March 4B,2020-051A,45939,Land survey satellite. Provide data for the co...,CHN,156.0,2020,


In [66]:
sat_df.iloc[1070]

Name of Satellite, Alternate Names    Hubble Space Telescope (HST, Space Telescope)
Current Official Name of Satellite                           Hubble Space Telescope
Country/Org of UN Registry                                                      USA
Country of Operator/Owner                                                       USA
Operator/Owner                                     European Space Agency (ESA)/NASA
Users                                                                    Government
Purpose                                                               Space Science
Detailed Purpose                                                                NaN
Class of Orbit                                                                  LEO
Type of Orbit                                                    Non-Polar Inclined
Longitude of GEO (degrees)                                                      0.0
Perigee (km)                                                                

In [67]:
# Manual check of the Age_Remaining arithmetic: a 1990 launch with a 10-year expected lifetime.
(1990+10)-2023

-23
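
The same arithmetic can be wrapped in a small function so the reference year is not hard-coded. This is a sketch only; `years_remaining` and `REFERENCE_YEAR` are hypothetical names that do not appear elsewhere in the notebook.

```python
import math

REFERENCE_YEAR = 2023  # year of the database update used in this notebook

def years_remaining(launch_year, expected_lifetime, reference_year=REFERENCE_YEAR):
    """Expected end-of-life year minus the reference year; NaN if the lifetime is unknown."""
    if expected_lifetime is None or math.isnan(expected_lifetime):
        return float('nan')
    return (launch_year + expected_lifetime) - reference_year

print(years_remaining(1990, 10))   # same arithmetic as the cell above: -23
print(years_remaining(2019, 0.5))  # -3.5
```

A negative result means the satellite is past its expected lifetime, which matches how the `Age_Remaining` column is read in the sorted table above.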

In [68]:
# Distribution of expected years remaining by satellite purpose.
age_chart = alt.Chart(sat_df
  ).mark_boxplot().encode(
    x=alt.X('Purpose'),
    y=alt.Y('Age_Remaining'),
    color = 'Purpose:N',
    tooltip=['Country of Operator/Owner','Current Official Name of Satellite','Date of Launch']
)
age_chart

In [69]:
# Expected years remaining against launch date, colored by purpose.
age_chart_2 = alt.Chart(sat_df
  ).mark_point().encode(
    x=alt.X('Date of Launch'),
    y=alt.Y('Age_Remaining'),
    color = 'Purpose:N',
    tooltip=['Country of Operator/Owner','Current Official Name of Satellite','Date of Launch']
)
age_chart_2

In [70]:
# Distribution of expected years remaining by class of orbit.
age_chart_3 = alt.Chart(sat_df
  ).mark_boxplot().encode(
    x=alt.X('Class of Orbit'),
    y=alt.Y('Age_Remaining'),
    color = 'Class of Orbit:N',
    tooltip=['Country of Operator/Owner','Current Official Name of Satellite','Date of Launch']
)
age_chart_3

In [71]:
# Distribution of expected years remaining by country of operator/owner.
age_chart_4 = alt.Chart(sat_df
  ).mark_boxplot().encode(
    x=alt.X('Country of Operator/Owner'),
    y=alt.Y('Age_Remaining'),
    color = 'Country of Operator/Owner:N',
    tooltip=['Country of Operator/Owner','Current Official Name of Satellite','Date of Launch']
)
age_chart_4

In [72]:
map_df = sat_df.groupby(['Country of Operator/Owner', 'Country_Code']).size().sort_values(ascending = False).reset_index(name = 'count')
display(map_df)

Unnamed: 0,Country of Operator/Owner,Country_Code,count
0,USA,840.0,4512
1,China,156.0,587
2,United Kingdom,826.0,561
3,Russia,643.0,177
4,Japan,392.0,88
...,...,...,...
67,Hungary,348.0,1
68,Nepal,524.0,1
69,Iraq,368.0,1
70,Jordan,400.0,1


In [73]:
countries = alt.topo_feature(data.world_110m.url, feature='countries')

# slider = alt.binding_range(min=0, max=1, step=0.05, name='opacity:')
# op_var = alt.param(value=0.1, bind=slider)

# The topojson features are keyed by a numeric 'id' (assumed here to be the ISO 3166
# numeric country code), so cast Country_Code to int and look up on 'id'.
map_lookup_df = map_df.astype({'Country_Code': int})

background = (alt.Chart(countries)
    .mark_geoshape(fill='lightgray', stroke='black', strokeWidth = 0.2)
    .properties(width=800, height=500)
    # Pull the country name and count into each feature before encoding them.
    .transform_lookup(lookup = 'id', from_ = alt.LookupData(data = map_lookup_df, key = 'Country_Code', fields = ['Country of Operator/Owner', 'count']))
    # Looked-up fields are not in a DataFrame Altair can inspect, so each needs an explicit type.
    .encode(color='count:Q', tooltip=[alt.Tooltip('Country of Operator/Owner:N', title="Country"), alt.Tooltip('count:Q', title="Satellites")])
    .project('naturalEarth1')
    )
background

alt.Chart(...)

### Visualization Project Part 2: Sketching your Data
---
Your Module 1 discussion post identified some high-level goals for working with a dataset of interest to you. In this post, you will expand on those goals to characterize your target problem and develop some low-fidelity prototypes for working with that data. First, identify two to three tasks you would wish to complete with your data, identifying: 

1. Why is a task pursued? (goal)

2. How is a task conducted? (means)

3. What does a task seek to learn about the data? (characteristics)

4. Where does the task operate? (target data)

5. When is the task performed? (workflow)

6. Who is executing the task? (roles)

Then, sketch a set of preliminary low-fidelity prototypes for addressing these tasks with the given data. You may either sketch freeform or use the Five Design Sheets approach to generate these prototypes (hand-sketched on paper is fine). Upload a copy of your sketches as part of your post.

### Part 2 Answers:

##### Preliminary Goals for this Data:

- Determine which country has the most satellites in orbit currently and execute a method that allows users to reach increased depths of information through this visualization, likely through grouping/aggregating features.
- Create a compelling way to identify orbital information about satellites. Look into different orbit features which may impact lifespan and/or see which orbiting altitudes are most congested.
- Feature engineer interesting statistics about the age of the satellites using life expectancy and launch date information.
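
For the congestion goal, one simple starting point is to bin satellites into altitude bands and count each band. The sketch below is illustrative only: the perigee values are hypothetical, and the 100 km band width and the helper `altitude_band` are assumptions rather than anything fixed by the dataset.

```python
from collections import Counter

# Hypothetical perigee altitudes in km; the real values live in sat_df['Perigee (km)'].
perigees_km = [540, 548, 550, 555, 1200, 35780, 35786]

BAND_WIDTH_KM = 100  # assumed bin width; tune to taste

def altitude_band(perigee_km, width=BAND_WIDTH_KM):
    """Return the lower edge of the altitude band containing perigee_km."""
    return int(perigee_km // width) * width

band_counts = Counter(altitude_band(p) for p in perigees_km)

# The most populated band is a first proxy for 'congestion'.
busiest_band, n_sats = band_counts.most_common(1)[0]
print(f'{busiest_band}-{busiest_band + BAND_WIDTH_KM} km: {n_sats} satellites')
```

Counting by perigee alone ignores eccentric orbits that sweep through several bands, so apogee would also need to be considered for a fuller picture.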

### Visualization Project Part 3: A Plan for Evaluation
---
In your previous post, you identified a series of tasks and goals for your visualization as well as some preliminary design ideas. We’ll jump ahead a few steps and start to think about how we might evaluate our design approach. Outline a preliminary evaluation that addresses your core goals with the visualization. Make sure your evaluation discusses: 

- The target question you want to answer
- The people you would recruit to answer that question
- The kinds of measures you would use to answer your question (e.g., insight depth, use cases, accuracy) and what these measures would tell you about the core question
- The approach you will use to answer that question (e.g., a journaling study, a formal experiment, etc.)
- How you would instantiate those methods (i.e., what would your participants do?)
- What criteria you would use to indicate that your visualization was successful

### About the Final Project
---
Throughout the Modules, you have found a dataset, characterized the corresponding goals and tasks you want to conduct with that data, designed preliminary approaches, and outlined how you would evaluate those approaches. For your final project, you will put these ideas into practice by executing on the project plan outlined in your prior posts.

For this project, you will implement a visualization using your data from Module 1 and preliminary low-fidelity prototypes from Module 2 to address your stated goals. You may implement this visualization using either Altair or another platform of your choice. Once implemented, conduct your evaluation based on the plan outlined in your Module 3 discussion post, making sure to conduct your evaluation with at least three people. You may refine any of your prior plan to reflect your evolving understanding of the challenges you are addressing. Be sure to address how your plan has changed from these earlier posts as part of your discussion. 

Your final project post should include: 

- A brief recap of your data, goals, and tasks, focusing on those that most directly influence your design
- Screenshots of and/or a link to your visualization implementation (see below for additional guidance)
- A summary of the key elements of your design and accompanying justification
- A discussion of your final evaluation approach, including the procedure, people recruited, and results. Note that, due to the difficulty of recruiting experts, you can use colleagues, friends, classmates, or family to evaluate your designs if experts or others from your target population are unavailable.
- A synthesis of your findings, including what elements of your approach worked well and what elements you would refine in future iterations.

Guidance and platforms for deploying Altair visualizations online include: 

- Altair: Interactive Plots on the Web
- Add Animated Charts To Your Dashboards With Streamlit-Python
- Creating Interactive Jupyter Notebooks and Deployment on Heroku Using Voila

