## **Exploratory analysis of dynamism from the BSD**

**Research objectives**
- RQ1: How has the composition of UK firms evolved over the past decade according to the BSD?
- RQ2: To what extent has the rate of creative destruction in the UK declined between 1997 and 2023? 
- RQ3: How have gaps between the most productive ‘frontier’ firms and ‘laggard’ firms evolved? 
- RQ4: How are changes in business dynamism and productivity dispersion related?

**Data source**: Business Structure Database (1998-2023). Aggregated data tables have been exported from the UK Data Service SecureLab.

### Executive Summary
- The business population has grown.
- Entry and exit have been stable overall, although entry was particularly strong between 2012 and 2018.

In [None]:
# Import packages and set filepaths
import pandas as pd
import numpy as np
import altair as alt
from pandas.api.types import CategoricalDtype
import os

import_path =
export_path =

#### **Summary of data tables**
##### *Table 1 - Population and job flows*
This table provides information on the business population and job flows in each year. The index is the year. The following dimensions are provided:
- Total
- Firm size (employment)
- Firm age
- Sector
- Region
- Within-industry productivity decile

|Year|Dimension|Category|Number of firms|Employment|Turnover|Entrants|Exits|Job creation (JC)|Job destruction (JD)|Multi-site firms|Multi-site employment|Site expansion|Site contraction|
|----|---------|--------|---------------|----------|--------|--------|-----|-----------------|--------------------|----------------|---------------------|--------------|----------------|
|2000|Total|All|
|2000|Size|Micro|
|2000|Size|Small|
|2000|Size|Medium|
|2000|Size|Large|

##### *Table 2 - Cohort analysis*
This table follows the cohort of firms starting in each year and tracks the entire cohort by age. The following dimensions are provided:
- Total
- Sector
- Region
- Firm size (employment)

|Cohort|Age|Dimension|Category|Number of firms|Avg size|Survival rate|KM rate|Share of employment|Share of turnover|High growth firms|Stagnant firms|
|------|---|---------|--------|---------------|--------|-------------|-------|-------------------|-----------------|-----------------|--------------|
|2000|0|Total|All|
|2000|1|Total|All|
|2000|2|Total|All|
|2000|3|Total|All|
|2000|4|Total|All|
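
To make the cohort layout concrete, here is a minimal sketch of how a simple survival rate could be derived from the firm counts by age. It assumes the survival rate is the number of firms at a given age divided by the cohort's count at age 0; the exported `Survival rate` and `KM rate` columns may be defined differently. The column names `cohort`, `age` and `n_firms` and all values are hypothetical.

```python
import pandas as pd

# Toy cohort table; values are invented for illustration only.
toy_cohort_df = pd.DataFrame({
    "cohort": [2000, 2000, 2000, 2001, 2001],
    "age": [0, 1, 2, 0, 1],
    "n_firms": [200, 160, 120, 100, 90],
})

# Sort so that the first row of each cohort is age 0
toy_cohort_df = toy_cohort_df.sort_values(["cohort", "age"]).reset_index(drop=True)

# Cohort size at birth, repeated on every row of the same cohort
birth_size = toy_cohort_df.groupby("cohort")["n_firms"].transform("first")

# Share of the original cohort still present at each age
toy_cohort_df["survival_rate"] = toy_cohort_df["n_firms"] / birth_size
print(toy_cohort_df)
```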

##### *Table 3 - Growth rates*

|Year|Dimension|Category|Number of firms|Employment|Turnover|Entrants|Exits|Job creation (JC)|Job destruction (JD)|Multi-site firms|Multi-site employment|Site expansion|Site contraction|
|----|---------|--------|---------------|----------|--------|--------|-----|-----------------|--------------------|----------------|---------------------|--------------|----------------|
|2000|Total|All|
|2000|Size|Micro|
|2000|Size|Small|
|2000|Size|Medium|
|2000|Size|Large|

##### *Table 4 - Productivity dispersion*

|Year|Dimension|Category|Number of firms|P10_Prod|P25_Prod|P50_Prod|Mean_Prod|P75_Prod|P90_Prod|SD_Prod|
|----|---------|--------|---------------|--------|--------|--------|---------|--------|--------|-------|
|2000|Total|All|
|2000|Size|Micro|
|2000|Size|Small|
|2000|Size|Medium|
|2000|Size|Large|
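
The percentile columns in this table can be combined into summary measures of the gap between frontier and laggard firms. The sketch below computes two common ones, the P90/P10 ratio and the log P90/P50 gap, on invented numbers. Which measures the analysis will actually use is an assumption here, and both require strictly positive percentile values.

```python
import numpy as np
import pandas as pd

# Toy productivity percentiles; values are invented for illustration only.
toy_prod_df = pd.DataFrame({
    "year": [2000, 2001],
    "P10_Prod": [10.0, 10.0],
    "P50_Prod": [40.0, 50.0],
    "P90_Prod": [160.0, 200.0],
})

# Ratio of the 90th to the 10th percentile: overall spread of the distribution
toy_prod_df["p90_p10_ratio"] = toy_prod_df["P90_Prod"] / toy_prod_df["P10_Prod"]

# Log gap between the 90th percentile and the median: upper-tail dispersion
toy_prod_df["log_p90_p50_gap"] = np.log(
    toy_prod_df["P90_Prod"] / toy_prod_df["P50_Prod"]
)
print(toy_prod_df)
```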

In [None]:
# Load data tables
population_df =
cohort_df = 
growth_df =
prod_df = 

<details>
<summary> View data preprocessing code</summary>


</details>

#### **1. The composition of the UK business population**

First, we want to assess what types of firms make up the business population in 2023: large or small, young or old. Which types of firms contribute the most to economic activity?

How has this changed over the last 20 years? Can we learn anything about structural change in the economy?

**Overall section findings**

In [None]:
# BSD facts - how has the total number of firms, employment and turnover changed over time?

total_population_df = population_df[population_df['Dimension'] == 'Total']

# Shared x-axis encoding: label every 2nd year and keep the labels horizontal
year_axis = alt.X('year:O', axis=alt.Axis(
    labelExpr="datum.value % 2 == 0 ? datum.label : ''",
    labelAngle=0))

n_firm_chart = alt.Chart(total_population_df).mark_line().encode(
    x=year_axis,
    y=alt.Y('n_firms:Q', title='Total number of firms in BSD', scale=alt.Scale(domainMin=1500000, domainMax=2500000), axis=alt.Axis(format=".2s"))
)

emp_chart = alt.Chart(total_population_df).mark_line().encode(
    x=year_axis,
    y=alt.Y('employment:Q', title='Total employment in BSD', scale=alt.Scale(domainMin=15000000, domainMax=22000000), axis=alt.Axis(format=".2s"))
)

# No domain limits set yet for turnover; Altair chooses the axis range from the data
turnover_chart = alt.Chart(total_population_df).mark_line().encode(
    x=year_axis,
    y=alt.Y('turnover:Q', title='Total turnover in BSD', axis=alt.Axis(format=".2s"))
)

# Lower limit fixed at 0; upper limit not yet set
productivity_chart = alt.Chart(total_population_df).mark_line().encode(
    x=year_axis,
    y=alt.Y('turnover_per_employee:Q', title='Average turnover per employee in BSD', scale=alt.Scale(domainMin=0), axis=alt.Axis(format=".2s"))
)

# Place the four charts side by side and display the result
basic_facts_chart = n_firm_chart | emp_chart | turnover_chart | productivity_chart
basic_facts_chart

**Key Findings**
- The business population has expanded over the last 20 years, with substantial growth taking place between 2011 and 2018.

**Questions to explore**

In [None]:
# Write helper functions to create formatted tables and charts to explore subsequently

# Additional variables to calculate:
# - Average employees per firm
# - Average turnover per employee
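
The two additional variables listed above could be derived from the aggregate totals roughly as follows. This is a sketch under stated assumptions: the helper name `add_derived_variables` and the column name `employees_per_firm` are hypothetical, the other column names follow the chart code earlier in the notebook, and the values are invented.

```python
import pandas as pd

# Toy totals table; values are invented for illustration only.
toy_total_df = pd.DataFrame({
    "year": [2000, 2001],
    "n_firms": [100, 110],
    "employment": [1000, 1210],
    "turnover": [50000.0, 66550.0],
})


def add_derived_variables(df):
    """Hypothetical helper: add per-firm and per-employee averages."""
    out = df.copy()
    # Average employees per firm = total employment / number of firms
    out["employees_per_firm"] = out["employment"] / out["n_firms"]
    # Average turnover per employee = total turnover / total employment
    out["turnover_per_employee"] = out["turnover"] / out["employment"]
    return out


derived_df = add_derived_variables(toy_total_df)
print(derived_df)
```

Note that a ratio of totals is an employment-weighted average, which generally differs from the mean of firm-level turnover per employee.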

#### **2. Assessing the decline in business dynamism**
- Entry and exit rates
- Survival rates
- Growth rates
- Job reallocation rates
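
As a starting point for these measures, the sketch below computes entry, exit and job reallocation rates from the flow counts in Table 1. It assumes same-year firm counts and employment as denominators, which is a simplification; other conventions, such as averaging current and previous-year employment, are also used. The column names `entrants`, `exits`, `jc` and `jd` are hypothetical and the values are invented.

```python
import pandas as pd

# Toy flows table; values are invented for illustration only.
toy_flows_df = pd.DataFrame({
    "year": [2000, 2001],
    "n_firms": [100, 110],
    "employment": [1000, 1100],
    "entrants": [12, 15],
    "exits": [8, 10],
    "jc": [150, 165],   # jobs created
    "jd": [100, 110],   # jobs destroyed
})

rates_df = toy_flows_df.assign(
    # Firm entry and exit as a share of the firm population
    entry_rate=lambda d: d["entrants"] / d["n_firms"],
    exit_rate=lambda d: d["exits"] / d["n_firms"],
    # Job creation and destruction as a share of employment
    jc_rate=lambda d: d["jc"] / d["employment"],
    jd_rate=lambda d: d["jd"] / d["employment"],
)

# Job reallocation is the sum of creation and destruction rates;
# net job creation is their difference.
rates_df["job_reallocation_rate"] = rates_df["jc_rate"] + rates_df["jd_rate"]
rates_df["net_job_creation_rate"] = rates_df["jc_rate"] - rates_df["jd_rate"]
print(rates_df)
```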

