This is another DataCamp project; here I'm investigating carbon emissions.
The data is publicly available on [nature.com](https://www.nature.com/articles/s41597-022-01178-9) and contains product carbon footprints (PCFs) for various companies. PCFs are the greenhouse gas emissions attributable to a given product, measured in CO<sub>2</sub>e (carbon dioxide equivalent).

This data is stored in a PostgreSQL database containing one table, `product_emissions`, which records PCFs by product as well as the stage of production in which these emissions occurred. Here's a snapshot of what `product_emissions` contains in each column:

### `product_emissions`

| field                              | data type |
|------------------------------------|-----------|
| `id`                                 | `VARCHAR`   |
| `year`                               | `INT`       |
| `product_name`                       | `VARCHAR`   |
| `company`                            | `VARCHAR`   |
| `country`                            | `VARCHAR`   |
| `industry_group`                     | `VARCHAR`   |
| `weight_kg`                          | `NUMERIC`   |
| `carbon_footprint_pcf`               | `NUMERIC`   |
| `upstream_percent_total_pcf`         | `VARCHAR`   |
| `operations_percent_total_pcf`       | `VARCHAR`   |
| `downstream_percent_total_pcf`       | `VARCHAR`   |



In [7]:
-- let's just take a look at the data
SELECT MIN(year), MAX(year)
FROM product_emissions;

Unnamed: 0,min,max
0,2013,2017


In [8]:
-- another look: the range of product weights (kg)
SELECT MIN(weight_kg), MAX(weight_kg)
FROM product_emissions;

Unnamed: 0,min,max
0,0.00127,600000.0


In [11]:
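-- let's find the minimum and maximum product weight (kg) for each country
-- note: weight_kg is the product's weight, not its emissions; emissions are in carbon_footprint_pcf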
SELECT MIN(weight_kg) AS min_kg, MAX(weight_kg) AS max_kg, country
FROM product_emissions 
GROUP BY country;

Unnamed: 0,min_kg,max_kg,country
0,1000.0,1000.0,Indonesia
1,0.02,1.093,Switzerland
2,0.14,1.0,Italy
3,0.090718,1.3,China
4,4.7,140000.0,Luxembourg
5,0.009,1000.0,Sweden
6,0.00127,1820.086,USA
7,1.0,1000.0,United Kingdom
8,1.0,1000.0,Netherlands
9,1.0,1500.0,Brazil
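
Because the query above aggregates `weight_kg`, its output describes product weights rather than emissions. A minimal sketch of the same per-country min/max on `carbon_footprint_pcf` follows. It assumes Python's built-in `sqlite3` as a stand-in for the PostgreSQL database, and the three rows are hypothetical toy values, not taken from the real dataset.

```python
import sqlite3

# Hypothetical toy rows: (country, weight_kg, carbon_footprint_pcf).
# These values are made up for illustration only.
rows = [
    ("USA", 2.0, 10.0),
    ("USA", 500.0, 40.0),
    ("Brazil", 1.0, 7.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_emissions "
    "(country TEXT, weight_kg REAL, carbon_footprint_pcf REAL)"
)
conn.executemany("INSERT INTO product_emissions VALUES (?, ?, ?)", rows)

# Same shape as the query above, but aggregating the footprint column.
query = """
SELECT country,
       MIN(carbon_footprint_pcf) AS min_pcf,
       MAX(carbon_footprint_pcf) AS max_pcf
FROM product_emissions
GROUP BY country
ORDER BY country;
"""
result = conn.execute(query).fetchall()
print(result)
```

The SQL inside `query` uses only standard aggregates, so it should carry over to PostgreSQL unchanged.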


In [14]:
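-- which years had the largest total product weight?
-- note: this sums weight_kg (product weight); total emissions would use carbon_footprint_pcf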

SELECT year, SUM(weight_kg) as sum_weight
FROM product_emissions 
GROUP BY year
ORDER BY sum_weight DESC;

Unnamed: 0,year,sum_weight
0,2015,2050463.0
1,2013,181328.2
2,2016,116869.8
3,2014,49212.73
4,2017,32597.91


In [15]:
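-- Which company has the largest total product weight overall?
-- note: this sums weight_kg (product weight); total emissions would use carbon_footprint_pcf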

SELECT company, SUM(weight_kg) as sum_weight
FROM product_emissions 
GROUP BY company
ORDER BY sum_weight DESC;

Unnamed: 0,company,sum_weight
0,"Gamesa Corporación Tecnológica, S.A.",1.961000e+06
1,Arcelor Mittal,1.400047e+05
2,Daimler AG,7.096000e+04
3,Volkswagen AG,3.321507e+04
4,Metsä Board,3.100000e+04
...,...,...
140,Retal,3.000000e-02
141,SK Hynix,2.520000e-02
142,Fabrica de Tapas Bavaria,4.700000e-03
143,Martin Bauer GmbH,4.000000e-03
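
Ranking companies by summed `weight_kg` can differ from ranking them by summed `carbon_footprint_pcf`. The sketch below computes both totals side by side and sorts by footprint. It assumes Python's built-in `sqlite3` in place of PostgreSQL, and "Company A" and "Company B" are hypothetical names with made-up values.

```python
import sqlite3

# Hypothetical toy rows: (company, weight_kg, carbon_footprint_pcf).
rows = [
    ("Company A", 1000.0, 50.0),
    ("Company A", 500.0, 25.0),
    ("Company B", 2.0, 300.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_emissions "
    "(company TEXT, weight_kg REAL, carbon_footprint_pcf REAL)"
)
conn.executemany("INSERT INTO product_emissions VALUES (?, ?, ?)", rows)

# Both totals per company, ordered by total footprint rather than weight.
query = """
SELECT company,
       SUM(weight_kg) AS total_weight_kg,
       SUM(carbon_footprint_pcf) AS total_pcf
FROM product_emissions
GROUP BY company
ORDER BY total_pcf DESC;
"""
result = conn.execute(query).fetchall()
print(result)
```

In this toy data the lighter company comes first, which is the kind of difference to check for in the real table before drawing conclusions about emitters.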


In [4]:
-- Complete the query:
-- Find the number of unique companies and their total carbon footprint PCF for each industry group, filtering for the most recent year in the database. The query should return three columns: industry_group, num_companies, and total_industry_footprint, with the last column being rounded to one decimal place. The results should be sorted by total_industry_footprint from highest to lowest values.

-- Find the most recent year:
SELECT MAX(year)
FROM product_emissions;

SELECT COUNT(DISTINCT company) as num_companies, ROUND(SUM(carbon_footprint_pcf),1) AS total_industry_footprint, industry_group
FROM product_emissions
WHERE year IN (SELECT MAX(year) FROM product_emissions)
GROUP BY industry_group
ORDER BY total_industry_footprint DESC;

Unnamed: 0,num_companies,total_industry_footprint,industry_group
0,3,107129.0,Materials
1,2,94942.7,Capital Goods
2,4,21865.1,Technology Hardware & Equipment
3,1,3161.5,"Food, Beverage & Tobacco"
4,1,740.6,Commercial & Professional Services
5,1,690.0,Software & Services
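
The task statement lists the columns as `industry_group`, `num_companies`, `total_industry_footprint`, while the query above returns `industry_group` last. A sketch with the columns in the stated order is shown below. It assumes Python's built-in `sqlite3` as a stand-in for PostgreSQL; the company names and figures are hypothetical toy values, and only the industry group labels are borrowed from the output above.

```python
import sqlite3

# Hypothetical toy rows: (year, company, industry_group, carbon_footprint_pcf).
rows = [
    (2016, "Company A", "Materials", 999.0),   # older year, should be excluded
    (2017, "Company A", "Materials", 100.2),
    (2017, "Company A", "Materials", 50.0),
    (2017, "Company B", "Materials", 10.0),
    (2017, "Company C", "Capital Goods", 500.04),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_emissions "
    "(year INTEGER, company TEXT, industry_group TEXT, carbon_footprint_pcf REAL)"
)
conn.executemany("INSERT INTO product_emissions VALUES (?, ?, ?, ?)", rows)

# Columns in the order the task asks for; the subquery picks the latest year.
query = """
SELECT industry_group,
       COUNT(DISTINCT company) AS num_companies,
       ROUND(SUM(carbon_footprint_pcf), 1) AS total_industry_footprint
FROM product_emissions
WHERE year = (SELECT MAX(year) FROM product_emissions)
GROUP BY industry_group
ORDER BY total_industry_footprint DESC;
"""
result = conn.execute(query).fetchall()
print(result)
```

Using `year = (SELECT MAX(year) ...)` instead of `IN` is only a stylistic choice here, since the subquery returns a single value.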
