# GVC participation rate

This notebook computes the trade-based and production-based GVC participation rates at the country level and at the 5-, 15-, and 35-sector levels, using the trade accounting framework of Borin and Mancini (2019). Results are saved in `data/`.

The trade-based GVC participation rate is defined as
```
GVCP_trade_f = (REX1 + REX2 + REX3 + REF1 + REF2) / Exports
GVCP_trade_b = (FVA + PDC1 + PDC2) / Exports
GVCP_trade = GVCP_trade_f + GVCP_trade_b
```
where sectors are broken down by export sectors.

The production-based GVC participation rate, meanwhile, is defined as
```
GVCP_prod = (DAVAX2 + REX1 + REX2 + REX3 + REF1 + REF2) / va
```
where sectors are broken down by origin sectors.
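As a quick illustration of the arithmetic in the definitions above, here is a minimal sketch for a single exporter. All numbers are made up for illustration; `exports` and `value_added` are hypothetical stand-ins for the `Exports` and `va` columns.

```python
# Hypothetical Borin-Mancini terms for one exporter (arbitrary currency units).
terms = {
    "DAVAX2": 10.0,
    "REX1": 6.0, "REX2": 3.0, "REX3": 1.0,
    "REF1": 4.0, "REF2": 1.0,
    "FVA": 20.0, "PDC1": 3.0, "PDC2": 2.0,
}
exports = 100.0      # assumed gross exports
value_added = 250.0  # assumed total value added

# Numerators of the two trade-based components.
gvc_trade_f = sum(terms[k] for k in ("REX1", "REX2", "REX3", "REF1", "REF2"))
gvc_trade_b = sum(terms[k] for k in ("FVA", "PDC1", "PDC2"))

# Trade-based rates are ratios to exports.
gvcp_trade_f = gvc_trade_f / exports
gvcp_trade_b = gvc_trade_b / exports
gvcp_trade = gvcp_trade_f + gvcp_trade_b

# The production-based rate adds DAVAX2 and is a ratio to value added.
gvcp_prod = (terms["DAVAX2"] + gvc_trade_f) / value_added

print(f"GVCP_trade_f = {gvcp_trade_f:.2f}")
print(f"GVCP_trade_b = {gvcp_trade_b:.2f}")
print(f"GVCP_trade   = {gvcp_trade:.2f}")
print(f"GVCP_prod    = {gvcp_prod:.2f}")
```

Note that the two measures use different denominators, so they are not directly additive.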

## Setup

In [1]:
import pandas as pd
import duckdb

In [2]:
ta, summary, output = 'ta.parquet', 'summary.parquet', 'gvcp.parquet'
# ta, summary, output = 'ta62.parquet', 'summary62.parquet', 'gvcp62.parquet'
# ta, summary, output = 'ta62-const.parquet', 'summary62-const.parquet', 'gvcp62-const.parquet'

## Load and process data

### Breakdown by export sectors

In [3]:
df_es = duckdb.sql(
    f"""
    (SELECT t, s, 0 AS agg, 0 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='none'
    GROUP BY t, s
    ORDER BY t, s)

    UNION ALL

    (SELECT t, s, 5 AS agg, i5 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='es'
    GROUP BY t, s, i5
    ORDER BY t, s, i5)

    UNION ALL
    
    (SELECT t, s, 15 AS agg, i15 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='es'
    GROUP BY t, s, i15
    ORDER BY t, s, i15)
    
    UNION ALL
    
    (SELECT t, s, 35 AS agg, i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='es'
    GROUP BY t, s, i
    ORDER BY t, s, i)
    """
).df()

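# Numerators of the trade-based participation rates; the ratios to Exports are taken after the final merge.
df_es['GVC_trade_f'] = df_es['REX1'] + df_es['REX2'] + df_es['REX3'] + df_es['REF1'] + df_es['REF2']
df_es['GVC_trade_b'] = df_es['FVA'] + df_es['PDC1'] + df_es['PDC2']
df_es['GVC_trade'] = df_es['GVC_trade_f'] + df_es['GVC_trade_b']
# Store the year as an integer and keep only the columns needed downstream.
df_es['t'] = df_es['t'].astype(int)
df_es = df_es[['t', 's', 'agg', 'i', 'Exports', 'GVC_trade_f', 'GVC_trade_b', 'GVC_trade']]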

### Breakdown by origin sectors

In [4]:
df_os = duckdb.sql(
    f"""
    (SELECT t, s, 0 AS agg, 0 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='none'
    GROUP BY t, s
    ORDER BY t, s)

    UNION ALL

    (SELECT t, s, 5 AS agg, i5 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='os'
    GROUP BY t, s, i5
    ORDER BY t, s, i5)

    UNION ALL
    
    (SELECT t, s, 15 AS agg, i15 AS i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='os'
    GROUP BY t, s, i15
    ORDER BY t, s, i15)
    
    UNION ALL
    
    (SELECT t, s, 35 AS agg, i, 
        sum(Exports) AS Exports, 
        sum(DAVAX1) AS DAVAX1,
        sum(DAVAX2) AS DAVAX2,
        sum(REX1) AS REX1,
        sum(REX2) AS REX2,
        sum(REX3) AS REX3,
        sum(REF1) AS REF1,
        sum(REF2) AS REF2,
        sum(FVA) AS FVA,
        sum(PDC1) AS PDC1,
        sum(PDC2) AS PDC2
    FROM read_parquet('../data/{ta}') WHERE breakdown='os'
    GROUP BY t, s, i
    ORDER BY t, s, i)
    """
).df()

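# Numerator of the production-based participation rate; the ratio to va is taken after the final merge.
df_os['GVC_prod'] = df_os['DAVAX2'] + df_os['REX1'] + df_os['REX2'] + df_os['REX3'] + df_os['REF1'] + df_os['REF2']
# Store the year as an integer and keep only the columns needed downstream.
df_os['t'] = df_os['t'].astype(int)
df_os = df_os[['t', 's', 'agg', 'i', 'GVC_prod']]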
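The four `UNION ALL` branches in the export-sector and origin-sector queries differ only in the `breakdown` filter and the sector column used for grouping. One possible way to avoid the repetition is to generate the SQL from a small helper. The sketch below is an assumption, not part of the original notebook: `TERMS`, `LEVELS`, and `build_ta_query` are hypothetical names, and the helper only builds the query string.

```python
# Columns summed in every branch of the query.
TERMS = ["Exports", "DAVAX1", "DAVAX2", "REX1", "REX2", "REX3",
         "REF1", "REF2", "FVA", "PDC1", "PDC2"]

# (agg label, sector column); None marks the country-level branch.
LEVELS = [(0, None), (5, "i5"), (15, "i15"), (35, "i")]


def build_ta_query(path: str, breakdown: str) -> str:
    """Build the UNION ALL query for one sectoral breakdown ('es' or 'os')."""
    sums = ",\n        ".join(f"sum({c}) AS {c}" for c in TERMS)
    branches = []
    for agg, col in LEVELS:
        if col is None:
            # Country level: no sector column, rows tagged breakdown='none'.
            select_i, group_by, where = "0 AS i", "t, s", "none"
        else:
            select_i, group_by, where = f"{col} AS i", f"t, s, {col}", breakdown
        branches.append(
            f"(SELECT t, s, {agg} AS agg, {select_i},\n"
            f"        {sums}\n"
            f"    FROM read_parquet('{path}') WHERE breakdown='{where}'\n"
            f"    GROUP BY {group_by}\n"
            f"    ORDER BY {group_by})"
        )
    return "\n\n    UNION ALL\n\n    ".join(branches)


query = build_ta_query("../data/ta.parquet", "os")
print(query)
```

With such a helper, each load cell could shrink to a single call such as `duckdb.sql(build_ta_query(f'../data/{ta}', 'es')).df()`, with the post-processing lines left unchanged.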

### Value added

In [5]:
va = duckdb.sql(
    f"""
    (SELECT t, s, 0 AS agg, 0 AS i, sum(va) AS va, 
    FROM read_parquet('../data/{summary}')
    GROUP BY t, s
    ORDER BY t, s)

    UNION ALL

    (SELECT t, s, 5 AS agg, i5 AS i, sum(va) AS va, 
    FROM read_parquet('../data/{summary}')
    GROUP BY t, s, i5
    ORDER BY t, s, i5)

    UNION ALL

    (SELECT t, s, 15 AS agg, i15 AS i, sum(va) AS va, 
    FROM read_parquet('../data/{summary}')
    GROUP BY t, s, i15
    ORDER BY t, s, i15)

    UNION ALL

    (SELECT t, s, 35 AS agg, i, sum(va) AS va, 
    FROM read_parquet('../data/{summary}')
    GROUP BY t, s, i
    ORDER BY t, s, i)
    """
).df()

In [6]:
# `t` was cast to int in `df_os` above, so cast it here too; otherwise the merge keys have mismatched dtypes.
va['t'] = va['t'].astype(int)
df_os = pd.merge(df_os, va, on=['t', 's', 'agg', 'i'])
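Merging `df_os` with `va` requires the key columns to share a dtype. The sketch below uses made-up toy frames rather than the notebook's data, and assumes the mismatch comes from `t` being an integer in one frame and a string in the other. It shows the failure and one way to align the keys before merging; `left`, `right`, and `merged` are hypothetical names.

```python
import pandas as pd

# Toy frames standing in for df_os and va; the values are made up.
left = pd.DataFrame({"t": [2017, 2018], "s": [1, 1], "GVC_prod": [5.0, 6.0]})
right = pd.DataFrame({"t": ["2017", "2018"], "s": [1, 1], "va": [50.0, 60.0]})

# Integer keys on one side and string keys on the other cannot be merged.
try:
    pd.merge(left, right, on=["t", "s"])
except ValueError as err:
    print("merge failed:", err)

# Align the key dtype, then merge on explicit keys.
# validate= makes pandas raise if the keys are not unique on both sides.
right["t"] = right["t"].astype(int)
merged = pd.merge(left, right, on=["t", "s"], validate="one_to_one")
print(merged)
```

Passing `on=` explicitly also guards against an accidental join on any other column the two frames happen to share.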

## Consolidate and save

In [None]:
df = pd.merge(df_es, df_os)
df['GVCP_trade_f'] = df['GVC_trade_f'] / df['Exports']
df['GVCP_trade_b'] = df['GVC_trade_b'] / df['Exports']
df['GVCP_trade'] = df['GVC_trade'] / df['Exports']
df['GVCP_prod'] = df['GVC_prod'] / df['va']

df = df[[
    't', 's', 'agg', 'i', 'Exports', 'va',
    'GVC_trade_f', 'GVC_trade_b', 'GVC_trade', 'GVC_prod',
    'GVCP_trade_f', 'GVCP_trade_b', 'GVCP_trade', 'GVCP_prod'
]]

df.to_parquet(f'../data/{output}', index=False)

### View results

In [None]:
duckdb.sql(f"SELECT * FROM read_parquet('../data/{output}')").df()