# Create lithium supply scenarios using *minerals_supply_scenarios*

This notebook creates lithium supply scenarios based on the S&P Capital IQ Pro database. The following scenarios are created:
- Baseline
- Ambitious
- Very Ambitious

The scenarios are built with *minerals_supply_scenarios*, a Python library for developing mineral supply scenarios from asset-level data. The tool streamlines importing, processing, and analyzing mining asset data from the S&P Capital IQ Pro database. By default, the scenarios are differentiated by the development stage of the mining projects, so each one reflects a different level of production expansion. The scenario data is exported as a scenario data file that is integrated into *premise*.

Get the tool from: https://github.com/robyistrate/minerals_supply_scenarios
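
To illustrate how development stages can translate into scenarios, the sketch below sums the planned production of the projects whose stage is admitted in each scenario. The stage groups, project names, and production values are hypothetical placeholders; the rules actually applied by *minerals_supply_scenarios* may differ.

```python
# Hypothetical stage groups, for illustration only. The rules actually
# applied by minerals_supply_scenarios may differ.
SCENARIO_STAGES = {
    "Baseline": {"Operating"},
    "Ambitious": {"Operating", "Construction Started"},
    "Very Ambitious": {"Operating", "Construction Started", "Feasibility"},
}

# Made-up projects with a development stage and a planned production
# (arbitrary units).
projects = [
    {"name": "Project A", "stage": "Operating", "production": 50.0},
    {"name": "Project B", "stage": "Construction Started", "production": 20.0},
    {"name": "Project C", "stage": "Feasibility", "production": 10.0},
    {"name": "Project D", "stage": "Exploration", "production": 5.0},
]


def scenario_production(projects, allowed_stages):
    """Sum the production of projects whose stage is admitted in the scenario."""
    return sum(p["production"] for p in projects if p["stage"] in allowed_stages)


totals = {
    name: scenario_production(projects, stages)
    for name, stages in SCENARIO_STAGES.items()
}
print(totals)
```

Because each stage group contains the previous one, total production can only grow from the baseline to the more ambitious scenarios.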

In [None]:
# Add minerals_supply_scenarios to path and import
import sys
import os
ROOT_DIR = os.path.abspath(os.path.join("../../minerals_supply_scenarios"))
if ROOT_DIR not in sys.path:
    sys.path.append(ROOT_DIR)

import minerals_supply_scenarios as mss

In [None]:
import pandas as pd
import numpy as np
import yaml
from pathlib import Path
import datetime

SCENARIO_DATA_PATH = Path("../scenario_data")
RESULTS_PATH = Path("../results")

## Create scenario data

In [2]:
SP_DATASET_PATH = SCENARIO_DATA_PATH / "external" / "SPGlobal_Export_12-23-2024_495bfe45-28c9-496d-8925-0e8ecf01c999.xls"
scenario_timeframe = (2020, 2035, 1)  # (start year, end year, step in years)
list_of_years = [str(year) for year in range(scenario_timeframe[0], scenario_timeframe[1]+1, scenario_timeframe[2])]

lithium_scenarios = mss.MetalSupplyScenarios(
    commodity="Lithium",
    dataset_path=SP_DATASET_PATH,
    timeframe=scenario_timeframe,
    specifics={"Deposit Type": ["Brine (Salar)"]},
    exclude={"Country": ["Canada"], 
             "Deposit Type": ["Brine (Salar), Pegmatite Hosted"]},
    export_dir=SCENARIO_DATA_PATH / "external"
    )

Considering only: {'Deposit Type': ['Brine (Salar)']}
Excluding: {'Country': ['Canada'], 'Deposit Type': ['Brine (Salar), Pegmatite Hosted']}
****************************************
Creating supply scenarios for lithium
Importing raw dataset from S&P database...
Applying strategy: fill_data_gaps
Applying strategy: estimate_future_production
Applying strategy: create_scenarios_data
Applying strategy: harmonize_production_data
Exporting premise scenario data file
*****************************
Processing report
Number projects in updated dataset : 143
Number projects with production in updated dataset : Production 2020     8
Production 2021     8
Production 2022     9
Production 2023    12
Production 2024    19
Production 2025    22
Production 2026    24
Production 2027    27
Production 2028    29
Production 2029    29
Production 2030    34
Production 2031    34
Production 2032    34
Production 2033    34
Production 2034    34
Production 2035    34
dtype: int64
Number projects in scenari

In [3]:
# Explore discarded projects
all_projects = list(set(lithium_scenarios.sp_dataset_updated.index))
considered_projects = list(set(lithium_scenarios.scenario_data["Project ID"]))
excluded_projects = list(set(all_projects) - set(considered_projects))

In [4]:
print("Number of projects discarded:", len(excluded_projects))
print("% of projects discarded:", round(len(excluded_projects) * 100 / len(all_projects)), "%")

Number of projects discarded: 109
% of projects discarded: 76 %


In [5]:
# Calculate % of projects excluded by development stage
dev_stage_discarded_projects = lithium_scenarios.sp_dataset_updated.loc[excluded_projects].groupby(
    'Development Stage').size().sort_values(ascending=False).round(0).reset_index(name='Number')
dev_stage_discarded_projects["Share"] = dev_stage_discarded_projects["Number"].divide(len(excluded_projects)).multiply(100).round(0)

dev_stage_discarded_projects

Unnamed: 0,Development Stage,Number,Share
0,Target Outline,46,42.0
1,Exploration,29,27.0
2,Reserves Development,12,11.0
3,Grassroots,9,8.0
4,Prefeas/Scoping,7,6.0
5,Limited Production,2,2.0
6,Advanced Exploration,1,1.0
7,Commissioning,1,1.0
8,Preproduction,1,1.0
9,Satellite,1,1.0
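
As a cross-check, the shares can be recomputed by hand from the counts printed above. The sketch below uses only the numbers shown in this notebook (143 projects in the updated dataset and the per-stage counts of discarded projects); the variable names are illustrative.

```python
# Counts of discarded projects per development stage, copied from the table above.
discarded_by_stage = {
    "Target Outline": 46,
    "Exploration": 29,
    "Reserves Development": 12,
    "Grassroots": 9,
    "Prefeas/Scoping": 7,
    "Limited Production": 2,
    "Advanced Exploration": 1,
    "Commissioning": 1,
    "Preproduction": 1,
    "Satellite": 1,
}
total_projects = 143  # projects in the updated dataset (processing report)

n_discarded = sum(discarded_by_stage.values())
share_discarded = round(n_discarded * 100 / total_projects)

# Share of each stage among the discarded projects, in percent.
share_by_stage = {
    stage: round(count * 100 / n_discarded)
    for stage, count in discarded_by_stage.items()
}

print(n_discarded, share_discarded)
print(share_by_stage)
```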


## Create "other projects" category based on available LCI datasets

Projects for which no LCI dataset is available are aggregated, per scenario and region, under the category "Other projects".
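
The aggregation can be sketched on a toy table as follows. This is a minimal, vectorized illustration rather than the notebook's own implementation: the project names, the `"t"` unit, the production values, and the `project_to_lci` dictionary are made-up placeholders.

```python
import pandas as pd

# Toy scenario data with the same column layout as the premise scenario file.
scenario_data = pd.DataFrame({
    "scenario": ["Baseline", "Baseline", "Baseline"],
    "region": ["AR", "US", "US"],
    "variables": [
        "Production|Lithium|Brine (Salar)|Project A",
        "Production|Lithium|Brine (Salar)|Project B",
        "Production|Lithium|Brine (Salar)|Project C",
    ],
    "unit": ["t", "t", "t"],  # placeholder unit
    "2020": [10.0, 1.0, 2.0],
    "2025": [20.0, 3.0, 4.0],
})

# Hypothetical project-to-LCI map: B and C have no dedicated LCI dataset.
project_to_lci = {
    "Project A": "lci_project_a",
    "Project B": "Other projects",
    "Project C": "Other projects",
}

# The project name is the last "|"-separated element of the variable.
project_name = scenario_data["variables"].str.rsplit("|", n=1).str[-1]
is_other = project_name.map(project_to_lci).eq("Other projects")

# Sum the production of the "other" projects per scenario, region, and unit.
others = (
    scenario_data[is_other]
    .groupby(["scenario", "region", "unit"], as_index=False)[["2020", "2025"]]
    .sum()
)
others["variables"] = "Production|Lithium|Brine (Salar)|Other projects"

# Keep the projects with a dedicated LCI dataset and append the aggregated rows.
adjusted = pd.concat([scenario_data[~is_other], others], ignore_index=True)
print(adjusted)
```

Projects B and C collapse into a single "Other projects" row for the US, while Project A keeps its own row.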

In [6]:
lithium_scenario_data = lithium_scenarios.premise_scenario_data.copy()

In [21]:
# Import the map between project names and LCI dataset
map_lithium_projects_to_lci = pd.read_excel(SCENARIO_DATA_PATH / "map_litihum_projects_to_lci.xlsx")
map_lithium_projects_to_lci.head()

Unnamed: 0,Project name,Country,Technology,LCI dataset,Relevance for LCI modelling,Li concentration,Other comments
0,Cauchari-Olaroz,AR,Evaporative ponds,df_rotary_dryer_Salar de Cauchari-Olaroz,DONE,0.0005,
1,Chaerhan Lake,CN,DLE,df_rotary_dryer_Chaerhan,DONE,0.000662,
2,Cuenca Centenario-Ratones,AR,DLE,df_rotary_dryer_Salar de Centenario,DONE,0.000309,
3,East Taijinair,CN,DLE,df_rotary_dryer_East Taijinar,DONE,0.0008,https://rslithium.com/research-report-on-lithi...
4,Fort Cady,US,Modelled as average,Other projects,NO; less than 1% of global production → Model ...,n.a.,


In [22]:
# Check that all projects are in the mapping file and the other way around
print("Projects not in the mapping file:")
for index, row in lithium_scenario_data.iterrows():
    project_var = row["variables"]
    project_name = project_var.split('|')[-1]

    if project_name not in list(map_lithium_projects_to_lci["Project name"]):
        print(project_name)

print("Projects not in the scenario:")
for project_name in list(map_lithium_projects_to_lci["Project name"]):
    if project_name not in [i.split('|')[-1] for i in list(set(lithium_scenario_data["variables"]))]:
        print(project_name)

Projects not in the mapping file:
Projects not in the scenario:
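
The same two-way check can also be written with set differences, which avoids the nested loops. The helper below is a sketch with a hypothetical name (`compare_project_names`) and made-up inputs; two empty lists correspond to the empty output above.

```python
def compare_project_names(scenario_variables, mapped_names):
    """Return (names missing from the mapping, names missing from the scenario)."""
    # The project name is the last "|"-separated element of each variable.
    scenario_names = {var.split("|")[-1] for var in scenario_variables}
    mapped = set(mapped_names)
    return sorted(scenario_names - mapped), sorted(mapped - scenario_names)


# Made-up example: "Project B" is unmapped, "Project C" is not in the scenario.
missing_in_mapping, missing_in_scenario = compare_project_names(
    [
        "Production|Lithium|Brine (Salar)|Project A",
        "Production|Lithium|Brine (Salar)|Project B",
    ],
    ["Project A", "Project C"],
)
print(missing_in_mapping, missing_in_scenario)
```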


In [23]:
other_projects_variables = []
other_projects_production = []

for index, row in lithium_scenario_data.iterrows():
    project_var = row["variables"]
    project_name = project_var.split('|')[-1]
    project_lci = map_lithium_projects_to_lci[map_lithium_projects_to_lci["Project name"] == project_name]["LCI dataset"].values[0]
        
    if project_lci == "Other projects":
        other_projects_variables.append(project_var)
        other_projects_production.append([row["scenario"], row["region"], row["variables"], row["unit"]] + [row[year] for year in list_of_years]
                    )
        
# Add other projects to scenario data:
other_projects_production_df = pd.DataFrame(other_projects_production, columns=lithium_scenario_data.columns)
other_projects_production_df = other_projects_production_df.groupby(['scenario', 'region', "unit"]).sum().reset_index()
other_projects_production_df["variables"] = "Production|Lithium|Brine (Salar)|Other projects"
lithium_scenarios_premise_adjusted = pd.concat([lithium_scenario_data, other_projects_production_df], ignore_index=True)
other_project_variables = list(set(other_projects_variables))
lithium_scenarios_premise_adjusted = lithium_scenarios_premise_adjusted[~lithium_scenarios_premise_adjusted['variables'].isin(other_project_variables)]
lithium_scenarios_premise_adjusted = lithium_scenarios_premise_adjusted.sort_values(by="scenario").reset_index(drop=True)

In [24]:
list(set(lithium_scenarios_premise_adjusted["region"]))

['BO', 'US', 'CL', 'AR', 'CN', 'DE']

In [25]:
# Export scenario data file for use in premise
lithium_scenarios_premise_adjusted[["scenario", "region", "variables", "unit", "2020", "2025", "2030", "2035"]].to_csv(
    SCENARIO_DATA_PATH / "external" / f"lithium_scenario_data_with_others_{datetime.datetime.today().strftime('%d-%m-%Y')}.csv", index=False)

## Generate scenario results

This section generates the results for Figure 2 in the main manuscript: production by country, total production, production shares by project, and production by technology, for each scenario.
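
For reference, the share calculation amounts to dividing each project's production by the total of its scenario. The sketch below shows one way to do this with `groupby(...).transform("sum")` on made-up numbers; it is an equivalent formulation for illustration, not necessarily the exact call used in this notebook.

```python
import pandas as pd

# Made-up production per scenario and (anonymized) project.
production = pd.DataFrame(
    {"2020": [30.0, 10.0, 5.0, 15.0], "2025": [60.0, 20.0, 10.0, 10.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Ambitious", "#1"),
            ("Ambitious", "Others"),
            ("Baseline", "#1"),
            ("Baseline", "Others"),
        ],
        names=["scenario", "project"],
    ),
)

# transform("sum") returns the scenario total repeated on every project row,
# so the division is aligned row by row.
scenario_totals = production.groupby(level="scenario").transform("sum")
shares = production / scenario_totals
print(shares)
```

Within each scenario and year, the shares add up to 1.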

In [26]:
# First, anonymize project names. Each project name is assigned a unique
# number, except for names containing "Other projects", which are labelled "Others".
projects_id_path = SCENARIO_DATA_PATH / "external" / "anonymized_projects_id.yaml"

# Check if the file already exists
if projects_id_path.exists():
    # If the file exists, open and load the existing content
    print("Opening existing anonymized projects ID list")
    with open(projects_id_path, "r") as yaml_file:
        project_ids = yaml.safe_load(yaml_file)
else:
    # If the file does not exist, create a new mapping
    print("Creating the anonymized projects ID list")
    project_ids = {}

    counter = 1
    for proj in list(set(lithium_scenarios_premise_adjusted["variables"])):
        if "Other projects" in proj:
            project_ids[proj] = "Others"
        else:
            project_ids[proj] = f"#{counter}"
            counter += 1

    # Save the mapping as a new YAML file
    with open(projects_id_path, "w") as yaml_file:
        yaml.dump(project_ids, yaml_file, default_flow_style=False)

Opening existing anonymized projects ID list
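
Note that the iteration order of a Python `set` of strings is not guaranteed to be the same across sessions, so rebuilding the mapping from scratch could assign different numbers to the same projects. Reusing the saved YAML file avoids this. A deterministic variant, sketched below with a hypothetical helper name and made-up variables, sorts the names before numbering them.

```python
def build_project_ids(variables):
    """Assign "#1", "#2", ... to sorted unique variables; "Other projects" -> "Others"."""
    project_ids = {}
    counter = 1
    for var in sorted(set(variables)):
        if "Other projects" in var:
            project_ids[var] = "Others"
        else:
            project_ids[var] = f"#{counter}"
            counter += 1
    return project_ids


# Made-up variables; duplicates and input order do not affect the result.
example_ids = build_project_ids([
    "Production|Lithium|Brine (Salar)|Project B",
    "Production|Lithium|Brine (Salar)|Other projects",
    "Production|Lithium|Brine (Salar)|Project A",
    "Production|Lithium|Brine (Salar)|Project B",
])
print(example_ids)
```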


In [27]:
# Production by country in each scenario
lithium_production_by_country = lithium_scenarios_premise_adjusted[["scenario", "region"] + [year for year in list_of_years]].groupby(["scenario", "region"]).sum()

# Total production in each scenario
total_lithium_production = lithium_production_by_country.groupby(level='scenario').sum()

# Production shares by project in each scenario
production_by_project = lithium_scenarios_premise_adjusted[["scenario", "variables"] + [year for year in list_of_years]].groupby(["scenario", "variables"]).sum().reset_index().rename(columns={"variables": "project"})
production_by_project['project'] = production_by_project['project'].replace(project_ids)
production_by_project.set_index(['scenario', 'project'], inplace=True)
sums_by_scenario_project = production_by_project.groupby('scenario').sum()
production_share_by_project = production_by_project.div(sums_by_scenario_project)

# Production by technology
production_by_technology = []
for index, row in lithium_scenarios_premise_adjusted.iterrows():
    project_name = row["variables"].split('|')[-1]
    project_technology = map_lithium_projects_to_lci[map_lithium_projects_to_lci["Project name"] == project_name]["Technology"].values

    if len(project_technology) == 0:
        project_technology = ["Modelled as average"]

    production_entry = (row["scenario"], project_technology[0]) + tuple(row[list_of_years])
    production_by_technology.append(production_entry)

production_by_technology_df = pd.DataFrame(production_by_technology, columns=["scenario", "technology"] + list_of_years)
production_by_technology_df = production_by_technology_df.groupby(['scenario', 'technology']).sum()

In [28]:
# Export results
lithium_production_by_country.to_csv(RESULTS_PATH / f"lithium_production_by_country_{datetime.datetime.today().strftime('%d-%m-%Y')}.csv")
total_lithium_production.to_csv(RESULTS_PATH / f"lithium_production_total_{datetime.datetime.today().strftime('%d-%m-%Y')}.csv")
production_share_by_project.to_csv(RESULTS_PATH / f"lithium_production_share_by_project_{datetime.datetime.today().strftime('%d-%m-%Y')}.csv")
production_by_technology_df.to_csv(RESULTS_PATH / f"lithium_production_by_technology_{datetime.datetime.today().strftime('%d-%m-%Y')}.csv")