<a href="https://colab.research.google.com/github/GArdennes/SDG_Visualisation_Tutorial/blob/main/00_Environment_Setup_v1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Environment Setup - SDG Data Visualization Project

This notebook guides you through setting up your development environment for the SDG Data Visualization project. We'll install all necessary dependencies and verify your setup.

## Learning Objectives

By the end of this setup, you will have:
- A properly configured Python environment
- All required libraries installed
- Data access methods configured
- Version control setup verified

In [None]:
# Check Python version
import sys
print(f"Python version: {sys.version}")
# sys.version_info is a tuple-like object, so it compares directly against (major, minor)
version_info = sys.version_info
assert version_info >= (3, 8), "Python 3.8+ required"
print("✓ Python version check passed")

Python version: 3.11.13 (main, Jun  4 2025, 08:57:29) [GCC 11.4.0]
✓ Python version check passed


In [None]:
# Install required packages
import subprocess
import sys

def install_package(package):
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✓ {package} installed successfully")
    except subprocess.CalledProcessError as e:
        print(f"✗ Failed to install {package}: {e}")

# Core data science packages
packages = [
    "pandas>=1.3.0",
    "numpy>=1.21.0",
    "matplotlib>=3.4.0",
    "seaborn>=0.11.0",
    "plotly>=5.0.0",
    "requests>=2.25.0",
    "pytest>=6.0.0",
    "pydeck>=0.9.0"
]

for package in packages:
    install_package(package)

✓ pandas>=1.3.0 installed successfully
✓ numpy>=1.21.0 installed successfully
✓ matplotlib>=3.4.0 installed successfully
✓ seaborn>=0.11.0 installed successfully
✓ plotly>=5.0.0 installed successfully
✓ requests>=2.25.0 installed successfully
✓ pytest>=6.0.0 installed successfully
✓ pydeck>=0.9.0 installed successfully
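
A successful `pip install` does not by itself show which version ended up in the environment. The sketch below is one possible way to compare an installed version against the minimums listed above, using only the standard library. The helper names (`parse_version`, `meets_minimum`, `check_requirement`) are hypothetical, and the parser is deliberately simple: it assumes purely numeric dotted versions and ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.21.0' into (1, 21, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirement(requirement):
    """Check one 'name>=minimum' string against the installed distribution."""
    name, minimum = requirement.split(">=")
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"✗ {name} is not installed"
    status = "✓" if meets_minimum(installed, minimum) else "✗"
    return f"{status} {name} {installed} (requires >= {minimum})"


for requirement in ["pandas>=1.3.0", "numpy>=1.21.0"]:
    print(check_requirement(requirement))
```

For anything beyond a quick check, a dedicated version parser is likely a better choice than this hand-rolled comparison.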


In [None]:
# Verify installations
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly
import plotly.express as px  # px is the conventional alias for plotly.express
import requests
import pydeck as pdk

print("✓ All core packages imported successfully")
print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Plotly version: {plotly.__version__}")
print(f"Pydeck version: {pdk.__version__}")

✓ All core packages imported successfully
Pandas version: 2.2.2
NumPy version: 2.0.2
Plotly version: 5.24.1
Pydeck version: 0.9.1


In [None]:
!git clone https://github.com/GArdennes/SDG_Visualisation_Tutorial.git

In [None]:
# Remove the cloned copy of this setup notebook to avoid working on a duplicate
%rm -f SDG_Visualisation_Tutorial/00_Environment_Setup_v1.ipynb

## Data Access Setup

Configure access to SDG data sources and verify connectivity.

In [None]:
# Test UN SDG API access

urls = [
    "https://unstats.un.org/SDGAPI/v1/sdg/Goal/List", # Link for the list of goals and information regarding them
    'https://unstats.un.org/sdgs/UNSDGAPIV5/v1/sdg/DataAvailability/CountriesList', # Link for the list of countries with SDG data
    "https://restcountries.com/v3.1/all?fields=name", # Link for information regarding all the countries in the world
]

def test_sdg_api(url):
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            print("✓ API is accessible")
            data = response.json()
            # Uncomment to view outputs
            # if isinstance(data, list) and len(data) > 0 and all(isinstance(item, dict) for item in data):
            #     for item in data:
            #         for key, value in item.items():
            #             print(f"{key}: {value}\n")
            #     return True
            # else:
            #     print(f"✗ Unexpected response format: {data}")
            #     return False
            return True
        else:
            print(f"✗ API returned status {response.status_code}")
            return False
    except Exception as e:
        print(f"✗ API connection failed: {e}")
        return False

for url in urls:
    test_sdg_api(url)

✓ API is accessible
✓ API is accessible
✓ API is accessible
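
The connectivity test only confirms an HTTP 200 status. The commented-out lines in the cell hint at a second check: that the JSON body is a non-empty list of records. A minimal sketch of that check as two small functions follows. The names `is_record_list` and `preview_records` are hypothetical, and the sample payload is made up for illustration, so the real API fields may differ.

```python
def is_record_list(data):
    """True when data is a non-empty list whose items are all dicts."""
    return (
        isinstance(data, list)
        and len(data) > 0
        and all(isinstance(item, dict) for item in data)
    )


def preview_records(data, limit=3):
    """Return 'key: value' lines for the first `limit` records."""
    if not is_record_list(data):
        raise ValueError(f"Unexpected response format: {type(data).__name__}")
    lines = []
    for item in data[:limit]:
        lines.extend(f"{key}: {value}" for key, value in item.items())
    return lines


# Hypothetical payload for illustration; the real field names may differ.
sample = [
    {"code": "1", "title": "Example goal"},
    {"code": "2", "title": "Another goal"},
]
for line in preview_records(sample, limit=1):
    print(line)
```

In the notebook, `preview_records(response.json())` could be called inside `test_sdg_api` once the status check has passed.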


In [None]:
# Verify project structure
import os

directories = [
    "SDG_Visualisation_Tutorial/data",
    "SDG_Visualisation_Tutorial/notebooks/lectures",
    "SDG_Visualisation_Tutorial/notebooks/exercises",
    "SDG_Visualisation_Tutorial/notebooks/labs",
]

for directory in directories:
    if os.path.exists(directory):
        print(f"✓ Directory exists: {directory}")
    else:
        print(f"✗ Directory does not exist: {directory}")
        os.makedirs(directory)
        print(f"✓ Directory created: {directory}")

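✓ Directory exists: SDG_Visualisation_Tutorial/data
✓ Directory exists: SDG_Visualisation_Tutorial/notebooks/lectures
✓ Directory exists: SDG_Visualisation_Tutorial/notebooks/exercises
✓ Directory exists: SDG_Visualisation_Tutorial/notebooks/labs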


You can inspect the project directories with standard file system commands in a code cell. For example, `!ls SDG_Visualisation_Tutorial` lists the contents of the cloned repository.

You can also browse the directories in the file browser in the left sidebar of your Colab notebook.
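
If you prefer to stay in Python rather than use shell commands, the same listing can be sketched with `pathlib`. The helper name `list_directory` is hypothetical, and the example assumes the repository was cloned into `SDG_Visualisation_Tutorial` in the current working directory.

```python
from pathlib import Path


def list_directory(path):
    """Return sorted entry names in `path`, with a trailing '/' on subdirectories."""
    root = Path(path)
    if not root.is_dir():
        # A missing directory yields an empty listing instead of an error
        return []
    return sorted(
        entry.name + "/" if entry.is_dir() else entry.name
        for entry in root.iterdir()
    )


# Assumed clone location; nothing is printed if the directory does not exist.
for name in list_directory("SDG_Visualisation_Tutorial"):
    print(name)
```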

In [None]:
import os
from io import StringIO

URL = "https://hub.arcgis.com/api/v3/datasets/eae2def2bf3c4c01b9ff358fcd1a5a13_0/downloads/data?format=csv&spatialRefId=3857&where=1%3D1"

try:
    response = requests.get(URL, timeout=30)
    response.raise_for_status()  # Raise an exception for bad status codes

    # Save the CSV file for later reference, creating the target directory if needed
    file_path = "data/sdg_data.csv"
    os.makedirs(os.path.dirname(file_path), exist_ok=True)
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(response.text)
    print(f"✓ Data saved successfully to {file_path}")

    # Uncomment to load the data into a pandas DataFrame;
    # StringIO lets pandas read the response text as if it were a file
    # df = pd.read_csv(StringIO(response.text))
    # print("✓ Data loaded successfully into a DataFrame:")
    # print(df.head())  # Print the first few rows to verify

except requests.exceptions.RequestException as e:
    print(f"✗ Failed to load data from API: {e}")
except pd.errors.EmptyDataError:
    print("✗ The response from the API is empty.")
except Exception as e:
    print(f"✗ An unexpected error occurred: {e}")

✓ Data saved successfully to data/sdg_data.csv
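
A "saved successfully" message only shows that the write succeeded, not that the file holds usable rows. A quick, dependency-free sanity check could look like the sketch below. The helper name `summarise_csv` is hypothetical, and the path assumes the `file_path` value used in the download cell.

```python
import csv
from pathlib import Path


def summarise_csv(path):
    """Return (column_names, row_count) for a CSV file, or None if it is missing or empty."""
    file = Path(path)
    if not file.is_file() or file.stat().st_size == 0:
        return None
    with file.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return None
        # Count the remaining rows without loading them all into memory
        row_count = sum(1 for _ in reader)
    return header, row_count


# Assumed location, taken from file_path in the download cell
summary = summarise_csv("data/sdg_data.csv")
if summary is None:
    print("✗ CSV file is missing or empty")
else:
    columns, rows = summary
    print(f"✓ {rows} data rows, {len(columns)} columns")
```

Once this passes, loading the file with `pd.read_csv` is the natural next step for real analysis.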


## Environment Verification

Run comprehensive tests to ensure everything is working correctly.

In [None]:
# Environment verification test suite
def run_verification_tests():
    tests_passed = 0
    total_tests = 0

    # Test 1: Package imports
    total_tests += 1
    try:
        import pandas, numpy, matplotlib, seaborn, plotly, requests
        print("✓ Test 1: Package imports - PASSED")
        tests_passed += 1
    except ImportError as e:
        print(f"✗ Test 1: Package imports - FAILED: {e}")

    # Test 2: Data creation
    total_tests += 1
    try:
        df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
        assert len(df) == 3
        print("✓ Test 2: Data creation - PASSED")
        tests_passed += 1
    except Exception as e:
        print(f"✗ Test 2: Data creation - FAILED: {e}")

    # Test 3: Visualization
    total_tests += 1
    try:
        # scatter() lives in plotly.express, not in the top-level plotly package
        import plotly.express as px
        fig = px.scatter(df, x='x', y='y')
        print("✓ Test 3: Visualization - PASSED")
        tests_passed += 1
    except Exception as e:
        print(f"✗ Test 3: Visualization - FAILED: {e}")

    print(f"\nVerification Results: {tests_passed}/{total_tests} tests passed")
    return tests_passed == total_tests

verification_passed = run_verification_tests()

## Success Criteria

Your environment is ready when:
- ✓ All package imports work without errors
- ✓ API is accessible
- ✓ Project directories are created
- ✓ Data is saved in the `data` directory

**Next Steps:** Proceed to `01_Lecture_Introduction_SDGs_v1.ipynb`