# Group Project part 01

#### Deadline for the code submission: October 10th at 08:59 am CET

#### Reminder
- your group is the one assigned to you by the University.
- one goal of this project is to learn how to work as a group, which is the standard in the tech industry. Therefore you need to resolve group issues on your own, as a group.
- if you cannot resolve the group issues on your own, escalate to the teacher early, not at the last minute.
- if the group splits, the whole group will receive a 0.

**Penalty for unexcused absence or lateness**:
- If you are absent or late on presentation day without an official excuse, you will receive 0 for the presentation part of the group project.
- If you are late without an official excuse and can still make it to the presentation of your team, you will still receive 0 for the presentation part of the group project.

## Objective
In this project, you will use your skills to:
- collect data through multiple APIs and open source datasets, for both quantitative and qualitative data
- merge data from different sources
- describe and analyse datasets
- uncover patterns and insights
- calculate aggregated measures and statistics
- create compelling data visualisations
- write clean code
- tell a story and convince your audience

Each group picks exactly one of the following scenarios.

Pick a topic that allows enough data collection and analysis to showcase all the skills gathered during the course, listed above.

### Scenario 01: Become a Business Manager

Your task is to design a local business that leverages data from various APIs to make informed, strategic decisions. Whether you're launching a street food stand, a drink shop, or another local venture, your team will gather and analyze relevant data (such as foot traffic, weather patterns, customer trends, or competitor insights) to shape your business plan. Your final deliverable will be a data-supported report and/or presentation to a management board, demonstrating how your findings guide key decisions in operations, marketing, or product offerings. The ultimate goal: to optimize performance and increase the chances of business success. Will your business thrive in today’s data-driven world?

Examples:
- lemonade stand business
- food truck business
- delivery service

### Scenario 02: Fact-Check Popular Beliefs

You are part of a fact-checking research team investigating common beliefs, trending opinions, or viral social media claims (e.g. “drinking lemon water boosts metabolism” or “blue light ruins your sleep”). Your goal is to dig into reliable sources, data, and expert opinions to determine whether these beliefs hold up under scrutiny. Use data to challenge or prove real-world claims with clear, persuasive insights. Drawing on research, statistics, and visual evidence, your team will present a well-supported explanation to help your audience separate fact from fiction.

You may also choose to divide the group into two sides—one defending the belief and the other challenging it—before presenting your findings in a debate or side-by-side analysis.

Examples:
- Electric cars are always better for the environment
- Areas with more green space have better physical and mental health outcomes.
- Does public sentiment on social media predict stock market trends?

## 01 - Getting Ready: first questions

Depending on the scenario you picked, please consider the following questions to help you get started.

### Scenario 01: Become a Business Manager

- What kind of business do we run? What do we sell? The choice of business must be original and unique to your group.
- What do we name our business?
- When do we operate? Is it an all-year-round business or a seasonal one? If seasonal, in which seasons? In which months / weeks / days / hours of the day do we operate?
- Where do we operate? In which countries / cities are we currently active? Where do we want to expand in the future? Determine where to set up your business stand based on weather conditions, local attractions, or events.
- Which datasets will help us make our business as successful as possible?

### Scenario 02: Fact-Check a popular belief

- What specific belief or claim do you want to investigate?

We aim to investigate the correlation between government funds spent on combating climate change and the actual effect of these funds, measured by countries' CO2 emissions over time.

- Why is this belief important or worth fact-checking?

This belief matters because a significant amount of public funding is going toward climate change initiatives. Understanding whether these initiatives are truly effective helps ensure accountability, smart use of resources, and real environmental progress. Fact-checking this belief can also shape how policies are developed and whether the public supports future climate budgets.

- What evidence or data supports or contradicts the belief?

Data from the OECD and the World Bank show a mixed picture. In some countries, higher government spending on climate programs has led to lower emissions and more renewable energy use. But in others, similar investments haven’t made much difference. This suggests that money alone isn’t always enough: how it’s used matters. Because of this, it’s too early to take a clear stance, and both datasets are needed to understand the full story.

- Will you split the team into two groups (in favor / against)?

Yes. The topic is complex enough that our final product will present both pro and contra arguments.

- What real-world impact does this belief have on people?

This belief shapes how people see the government's role in fighting climate change and, most importantly, how willing they are to support it through taxes or public programs. It also affects things like job opportunities in clean energy, the cost of electricity, and how quickly we can move toward a low-carbon future.

- What are the consequences if people continue believing or acting on this (true or false) idea?

True: Ongoing funding could speed up climate action, protect people’s health, and help build a stronger, more sustainable economy.

False: Money could be wasted, public trust could take a hit, and real progress on climate solutions might be delayed, leading to even greater environmental and economic costs down the line.

## 02 - Collect data from multiple APIs, the more the merrier

Integrate with as many APIs as you can, e.g.:
- OpenWeatherMap API
- Google Maps
- TripAdvisor
- News API
- Yelp
- Wikipedia
- Booking
- Amadeus Travel API
- Foursquare
- etc. (do your own research and be original!)

Each API can provide different types of information. Pick the ones that best suit your scenario.

After collecting all the data you need, save it.
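One possible way to save collected data is sketched below, using only the Python standard library. The helper names `save_json` and `save_csv` and the file layout are illustrative assumptions, not part of the assignment: JSON suits nested or text-heavy API responses, CSV suits flat tables.

```python
import csv
import json
from pathlib import Path


def save_json(records, path):
    """Write a list of dicts (e.g. raw API responses) to a JSON file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create folders if needed
    with path.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return path


def save_csv(rows, path):
    """Write a list of flat dicts (one per observation) to a CSV file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Collect every key that appears so that no column is silently dropped.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with path.open("w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # missing keys are written as empty cells
    return path
```

Keeping the raw files on disk means the analysis can be rerun later without calling the APIs again.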

### Using the World Bank API

We use the World Bank API to retrieve a rich quantitative dataset that includes GHG emissions, financial data on climate fund budgets and spending, as well as GDP and demographic data.

In [None]:
# Requires the wbgapi package (install it first, e.g. with `pip install wbgapi`).
import wbgapi as wb
import pandas as pd

energy_indicators = [
    "EG.EGY.PRIM.PP.KD", "EG.ELC.ACCS.RU.ZS", "EG.ELC.ACCS.UR.ZS", "EG.ELC.ACCS.ZS",
    "EG.ELC.COAL.ZS", "EG.ELC.FOSL.ZS", "EG.ELC.HYRO.ZS", "EG.ELC.LOSS.ZS",
    "EG.ELC.NGAS.ZS", "EG.ELC.NUCL.ZS", "EG.ELC.PETR.ZS", "EG.ELC.RNEW.ZS",
    "EG.ELC.RNWX.KH", "EG.ELC.RNWX.ZS", "EG.FEC.RNEW.ZS", "EG.GDP.PUSE.KO.PP",
    "EG.GDP.PUSE.KO.PP.KD", "EG.IMP.CONS.ZS", "EG.USE.COMM.CL.ZS", "EG.USE.COMM.FO.ZS",
    "EG.USE.COMM.GD.PP.KD", "EG.USE.CRNW.ZS", "EG.USE.ELEC.KH.PC", "EG.USE.PCAP.KG.OE"
]

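# Fetch all energy indicators for every economy.
worldb_energy_df = wb.data.DataFrame(
    energy_indicators,
    time=range(2000, 2020),  # years 2000 to 2019 (the end of range is exclusive)
    skipBlanks=True,         # skip empty observations
    labels=True,             # add readable labels next to the codes
    columns='series'         # one column per indicator
).reset_index()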

worldb_energy_df.head(50)

In [None]:
envghg_indicators = [
    "EN.CLC.DRSK.XQ", "EN.CLC.MDAT.ZS",
    "EN.GHG.ALL.LU.MT.CE.AR5", "EN.GHG.ALL.MT.CE.AR5", "EN.GHG.ALL.PC.CE.AR5", "EN.GHG.CH4.AG.MT.CE.AR5",
    "EN.GHG.CO2.AG.MT.CE.AR5", "EN.GHG.CO2.BU.MT.CE.AR5", "EN.GHG.CO2.FE.MT.CE.AR5",
    "EN.GHG.CO2.IC.MT.CE.AR5", "EN.GHG.CO2.IP.MT.CE.AR5", "EN.GHG.CO2.LU.DF.MT.CE.AR5", "EN.GHG.CO2.LU.FL.MT.CE.AR5",
    "EN.GHG.CO2.LU.MT.CE.AR5", "EN.GHG.CO2.LU.OL.MT.CE.AR5", "EN.GHG.CO2.LU.OS.MT.CE.AR5", "EN.GHG.CO2.MT.CE.AR5",
    "EN.GHG.CO2.PC.CE.AR5", "EN.GHG.CO2.PI.MT.CE.AR5", "EN.GHG.CO2.RT.GDP.KD", "EN.GHG.CO2.RT.GDP.PP.KD",
    "EN.GHG.CO2.TR.MT.CE.AR5", "EN.GHG.CO2.WA.MT.CE.AR5", "EN.GHG.CO2.ZG.AR5", "EN.GHG.FGAS.IP.MT.CE.AR5",
    "EN.GHG.TOT.ZG.AR5"
]

worldb_envghg_df = wb.data.DataFrame(
    envghg_indicators,
    time=range(2000, 2020),
    skipBlanks=True,
    labels=True,
    columns='series'
).reset_index()

worldb_envghg_df.head(50)

In [None]:
envfin_indicators = [
    "NY.GDP.MKTP.CD",
    "NY.GDP.MKTP.CN",
    "NY.GDP.MKTP.KD.ZG",
    "SP.POP.TOTL",
    "NY.ADJ.DCO2.CD",
    "NY.ADJ.DCO2.GN.ZS",
    "NY.ADJ.DPEM.CD",
    "NY.ADJ.DPEM.GN.ZS",
    "DT.NFL.UNEP.CD"
]

worldb_envfin_df = wb.data.DataFrame(
    envfin_indicators,
    time=range(2000, 2020),
    skipBlanks=True,
    labels=True,
    columns='series'
).reset_index()

worldb_envfin_df.head(50)
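
The three World Bank tables above are retrieved with the same settings, so they can later be combined on their shared country and year keys. The sketch below shows such an inner join without any dependencies. The key names `economy` and `time`, the function `merge_on_keys`, the column names, and all values are illustrative assumptions (placeholders, not real World Bank figures).

```python
def merge_on_keys(left, right, keys=("economy", "time")):
    """Inner-join two lists of row dicts on the given key columns.

    Assumes each key combination appears at most once in `right`.
    """
    # Index the right-hand rows by their key tuple for fast lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:  # keep only rows present in both tables
            merged.append({**row, **match})
    return merged


# Placeholder rows: the numbers are made up for illustration only.
energy_rows = [
    {"economy": "DEU", "time": 2000, "renewable_share": 1.0},
    {"economy": "FRA", "time": 2000, "renewable_share": 2.0},
]
ghg_rows = [
    {"economy": "DEU", "time": 2000, "co2_per_capita": 3.0},
    {"economy": "ESP", "time": 2000, "co2_per_capita": 4.0},
]

combined = merge_on_keys(energy_rows, ghg_rows)
print(combined)
```

On pandas DataFrames, `DataFrame.merge` with its `on` and `how` parameters performs the same kind of join.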

### Using the Wikipedia API for qualitative context
We will use the Wikipedia API to gather qualitative background information that complements our quantitative World Bank datasets. Collecting this contextual data will help explain trends and provide narratives about how and why emissions changed in specific countries.

In [None]:
import wikipedia

wikipedia.set_lang('en')

topics = [
    "Climate change",
    "Climate change mitigation",
    "Climate change adaptation",
    "United Nations Framework Convention on Climate Change",
    "Kyoto Protocol",
    "Paris Agreement",
    "Nationally determined contribution",
    "Loss and damage (climate change)",
    "Loss and Damage Fund",
    "Climate finance",
    "Green Climate Fund",
    "The Adaptation Fund",
    "Global Environment Facility",
    "European Union Emissions Trading System",
    "Emissions trading",
    "Carbon price",
    "Carbon tax",
    "Carbon emission trading",
    "EU Carbon Border Adjustment Mechanism",
    "European Green Deal",
    "Fit for 55",
    "Carbon budget",
    "Global Carbon Project",
    "List of countries by carbon dioxide emissions",
    "List of countries by carbon dioxide emissions per capita",
    "List of countries by greenhouse gas emissions",
    "List of countries by greenhouse gas emissions per capita",
    "List of countries by carbon intensity",
    "Fossil fuel subsidies",
    "Greenhouse gas emissions from agriculture",
    "Climate Change Performance Index",
    "Climate change in Germany"
]

articles = []

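# Fetch the summary of each topic. A failed lookup (missing page,
# ambiguous title, network error) is stored as None so the loop keeps going.
for t in topics:
    try:
        s = wikipedia.summary(t)
    except Exception:
        s = None
    articles.append({'title': t, 'extract': s})

# Print every summary for a quick visual check.
for a in articles:
    print('===', a['title'], '===')
    print(a['extract'])
    print()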
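
Because the loop above stores `None` for any topic whose lookup fails, it can help to separate usable summaries from failed ones before analysing the text. The sketch below is one way to do that. The function name `split_by_availability` and the sample records are hypothetical placeholders, not real Wikipedia content.

```python
def split_by_availability(articles):
    """Return (found, missing_titles) for a list of {'title', 'extract'} dicts."""
    # An extract counts as usable only if it is a non-empty string.
    found = [a for a in articles if a.get("extract")]
    missing_titles = [a["title"] for a in articles if not a.get("extract")]
    return found, missing_titles


# Placeholder records, not real Wikipedia text.
sample = [
    {"title": "Topic A", "extract": "Some summary text."},
    {"title": "Topic B", "extract": None},
]
found, missing_titles = split_by_availability(sample)
print(f"{len(found)} summaries found, missing: {missing_titles}")
```

The list of missing titles shows which topics may need a corrected title or a manual lookup.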

## 03 - Collect data from open source data sources

The dataset must align with your end-goal and serve its purpose.

- government websites,
- statistics institutes,
- etc.

At the end of this step, you should have collected both quantitative and qualitative data.