# Investigating the frequency of newly incorporated companies with the same post code in the UK

<br>

In [1]:
# importing numpy.
import numpy as np

# importing pandas.
import pandas as pd

# importing regular expressions.
import re

***

<br>

#### Reading in 500 companies incorporated in 2022 from Companies House

The dataset contains the first 500 companies incorporated in the UK in 2022, exported as a CSV from the Companies House advanced search, available [here](https://find-and-update.company-information.service.gov.uk/advanced-search) [1].

In [2]:
# Forward slash avoids the invalid "\c" escape and works on every platform.
data = "data/comp_house-13012022.csv"

In [3]:
# Reading in CSV exported from Companies House.
comp_500 = pd.read_csv(data)

***

<br>

#### Creating Regex for post code

The regular expression for UK post codes was obtained [here](https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation) [2]. 

In [4]:
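# Raw strings keep the backslash in \s intact; adjacent literals are joined into one pattern.
re_post = (
    r"([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|"
    r"(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|"
    r"([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))\s?[0-9][A-Za-z]{2})"
)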
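
As a quick sanity check, the pattern can be tried on a few sample strings before it is applied to the whole dataset. This is a minimal sketch: the sample strings are made up for illustration and are not rows from the Companies House export, `find_post_code` is a hypothetical helper, and the pattern is rebuilt from raw string pieces so the example runs on its own.

```python
import re

# Same pattern as in the cell above, written as raw string pieces.
re_post = (
    r"([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|"
    r"(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|"
    r"([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))\s?[0-9][A-Za-z]{2})"
)

# Illustrative sample strings (assumptions, not taken from the CSV).
samples = [
    "71-75 Shelton Street Covent Garden London WC2H 9JQ",
    "1 Example Road Manchester M1 1AE",
    "No post code in this string",
]


def find_post_code(address):
    """Return the first post code found in the address, or None."""
    match = re.search(re_post, address)
    return match.group(0) if match else None


results = [find_post_code(s) for s in samples]
print(results)
```

The last sample shows that `re.search` returns `None` when nothing matches, which is worth handling before calling `.group(0)`.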

***

<br>

#### Searching post codes within the address

Running the regular expression on each address and appending the matched post code to a list.

In [5]:
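post = []

for address in comp_500["registered_office_address"]:
    # re.search returns None when no post code is found, so check before calling group().
    # str() guards against missing (NaN) addresses.
    match = re.search(re_post, str(address))
    post.append(match.group(0) if match else None)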
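
Because the pattern accepts lower-case letters and an optional space, the same post code could in principle be captured in more than one spelling, and each spelling would then be counted separately. A minimal sketch of normalising the matches first: `normalise_post_code` is a hypothetical helper that is not part of the original notebook, and it relies on the inward part of a UK post code always being the last three characters.

```python
def normalise_post_code(raw):
    """Upper-case a matched post code and put one space before its last three characters."""
    if raw is None:
        return None
    # Remove any whitespace, then upper-case.
    compact = "".join(raw.split()).upper()
    # Outward code, single space, three-character inward code.
    return compact[:-3] + " " + compact[-3:]


print(normalise_post_code("wc2h9jq"))
print(normalise_post_code("WC2H 9JQ"))
```

Both calls print the same canonical form, so the two spellings would fall into one group when counted.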

***

<br>

#### Adding post code as a separate column

In [6]:
comp_500["Post Code"] = post

***

<br>

#### Adding frequency of the post code as a column

In [7]:
# Count how many rows share each 'Post Code' value and store it as the 'PO frequency' column.
# transform('count') broadcasts each group's count back to every row in that group. [3]
comp_500['PO frequency'] = comp_500.groupby("Post Code")["Post Code"].transform('count')
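
To see what `transform('count')` does, it helps to run it on a tiny table. A minimal sketch on toy data: the four rows below are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy data for illustration only; these are not rows from the dataset.
toy = pd.DataFrame({"Post Code": ["WC2H 9JQ", "M1 1AE", "WC2H 9JQ", "WC2H 9JQ"]})

# transform('count') returns one value per original row,
# so the result can be assigned straight back as a column.
toy["PO frequency"] = toy.groupby("Post Code")["Post Code"].transform("count")

print(toy)
```

Unlike an aggregation, which would return one row per post code, `transform` keeps the original row count, which is why the same frequency repeats on every row with the same post code.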

***

<br>

#### Displaying dataframe

Displaying the top 100 rows in descending order of 'PO frequency'.

In [8]:
pd.set_option("display.max_rows", None, "display.max_columns", None)

comp_500.sort_values('PO frequency', ascending=False).head(100)

| Unnamed: 0 | company_name | company_number | company_status | company_type | dissolution_date | incorporation_date | nature_of_business | registered_office_address | Post Code | PO frequency |
|---|---|---|---|---|---|---|---|---|---|---|
| 205 | CHANGING LIVES SUPPORT LTD | 13845879 | Active | Private limited by guarantee without share cap... | | 13/01/2022 | 86900 88990 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 193 | MOUAD LTD | 13845828 | Active | Private limited company | | 13/01/2022 | 47710 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 197 | AION COMMUNICATIONS LTD | 13846365 | Active | Private limited company | | 13/01/2022 | 61900 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 204 | COMPLEX ADAPTIVE SOFTWARE SYSTEMS LTD | 13845881 | Active | Private limited company | | 13/01/2022 | 58290 62011 62012 72190 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 198 | ANIJIE GLOBAL LTD | 13846367 | Active | Private limited company | | 13/01/2022 | 96090 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 199 | SEMPER EXCELSIUS LTD | 13846366 | Active | Private limited company | | 13/01/2022 | 68100 68209 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 200 | ANALYTICR LTD | 13846386 | Active | Private limited company | | 13/01/2022 | 96090 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 211 | NEXT STEP TECH SERVICES LTD | 13845676 | Active | Private limited company | | 13/01/2022 | 62020 62090 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 210 | ADMEP LTD | 13845675 | Active | Private limited company | | 13/01/2022 | 62012 62020 62090 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |
| 209 | LOSOMO STUDIO LIMITED | 13845672 | Active | Private limited company | | 13/01/2022 | 96090 | 71-75 Shelton Street Covent Garden London WC2H... | WC2H 9JQ | 28 |


***

<br>

## References

1. https://find-and-update.company-information.service.gov.uk/advanced-search
2. https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
3. https://stackoverflow.com/questions/22391433/count-the-frequency-that-a-value-occurs-in-a-dataframe-column

***

# End