# Exploring naturalisation records in Archives New Zealand

Series 8333 in Archives New Zealand is the central filing series for the NZ Department of Internal Affairs and includes many applications for naturalisation. The series is too large to harvest in full, given the current limitations of Archway, the online database. To create a harvestable dataset, we searched series 8333 for the keywords 'naturalisation' or 'naturalization' and limited the results to dates between 1840 and 1905.

Harvesting was done using the notebook in the [Archives New Zealand section](https://glam-workbench.net/archway/) of the GLAM Workbench.

The harvested dataset is available as a [CSV file](series8333_naturalisation_1840_1905.csv).

This is a correspondence series, not a register of naturalisations, so we can't be sure how accurately the number of files reflects the number of naturalisations. But it's a useful point of reference.

In [1]:
import pandas as pd
import altair as alt

First load the harvested data.

In [3]:
df = pd.read_csv('series8333_naturalisation_1840_1905.csv')
df.head()

Unnamed: 0,Access status,Accession,Agency,Alternative no.,Box/Item,Date,Former archives ref,Item ID,Part,Record group,Record no.,Record type,Sep,Series,Title
0,OPEN ACCESS,,ACGO,,126 /,1854 - 1854,IA1,R23522329,,IA1,1854/17,Text,,8333,"From: Josiah Flight, Resident Magistrate, New ..."
1,OPEN ACCESS,,ACGO,,126 /,1854 - 1854,IA1,R23522356,,IA1,1854/249,Text,,8333,Date: 19 January 1854 Subject: Proclamation - ...
2,OPEN ACCESS,,ACGO,,127 /,1854 - 1854,IA1,R23522383,,IA1,1854/412,Text,,8333,"From: Robert Henry Wynyard, Officer Administer..."
3,OPEN ACCESS,,ACGO,,128 /,1853 - 1854,IA1,R23522418,,IA1,1854/621,Text,,8333,"From: Robert Henry Wynyard, Officer Administer..."
4,OPEN ACCESS,,ACGO,,131 /,1854 - 1854,IA1,R23522493,,IA1,1854/1310,Text,,8333,"From: Robert Henry Wynyard, Officer Administer..."
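Before counting files, it may be worth confirming that each file appears only once and that the 'Date' values follow the pattern seen above. The sketch below is only an illustration: it rebuilds the five rows shown above (trimmed to four columns) instead of reading the full CSV, and the names `sample`, `duplicate_ids` and `malformed_dates` are made up for this example.

```python
import io
import pandas as pd

# Five rows copied from the df.head() output above, trimmed to a few columns.
sample_csv = """Item ID,Date,Record no.,Access status
R23522329,1854 - 1854,1854/17,OPEN ACCESS
R23522356,1854 - 1854,1854/249,OPEN ACCESS
R23522383,1854 - 1854,1854/412,OPEN ACCESS
R23522418,1853 - 1854,1854/621,OPEN ACCESS
R23522493,1854 - 1854,1854/1310,OPEN ACCESS
"""
sample = pd.read_csv(io.StringIO(sample_csv))

# Duplicated Item IDs would inflate the yearly counts.
duplicate_ids = int(sample['Item ID'].duplicated().sum())

# Assumption: every Date value looks like 'YYYY - YYYY', as in the rows above.
well_formed = sample['Date'].str.match(r'^\d{4} - \d{4}$', na=False)
malformed_dates = int((~well_formed).sum())

print(duplicate_ids, malformed_dates)
```

Running the same two checks against the full `df` would show whether these assumptions hold for the whole harvest.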


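Extract a value for the year from the date range. This takes the first four characters of 'Date', which is the start year of the range, and keeps it as a string.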

In [4]:
df['year'] = df['Date'].str.slice(0,4)
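Slicing the first four characters keeps the start year as a string. One of the rows shown earlier has the range '1853 - 1854', so the start and end years can differ. As a rough sketch of an alternative, assuming every value follows the 'YYYY - YYYY' pattern, both years could be extracted as numbers. The column names `start_year` and `end_year` are illustrative and are not part of the harvested data.

```python
import pandas as pd

# Date values copied from the df.head() output shown earlier.
dates = pd.DataFrame({'Date': ['1854 - 1854', '1853 - 1854', '1854 - 1854']})

# Capture both four-digit years using named groups.
years = dates['Date'].str.extract(r'^(?P<start_year>\d{4}) - (?P<end_year>\d{4})$')

# Convert to numbers; values that don't match become NaN instead of raising an error.
dates['start_year'] = pd.to_numeric(years['start_year'], errors='coerce')
dates['end_year'] = pd.to_numeric(years['end_year'], errors='coerce')

# Files whose date range covers more than one year.
multi_year = dates[dates['start_year'] != dates['end_year']]
print(multi_year)
```

Whether the start or end year is the better basis for counting is a judgment call; the notebook uses the start year.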

Aggregate the number of files by year.

In [11]:
year_counts = df['year'].value_counts().to_frame().reset_index()
year_counts.columns = ['year', 'count']
year_counts.head()

Unnamed: 0,year,count
0,1899,817
1,1905,704
2,1893,606
3,1887,504
4,1890,423
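`value_counts()` sorts by frequency and only includes years that occur in the data. For a chronological view it can help to sort by year and give years with no files an explicit zero. The sketch below uses only the five rows shown above as stand-in data, so every other year comes out as zero here; with the full `year_counts` frame the same steps should apply. The names `by_year` and `full_range` are illustrative.

```python
import pandas as pd

# The five rows shown above; years are strings because they come from str.slice().
year_counts = pd.DataFrame({
    'year': ['1899', '1905', '1893', '1887', '1890'],
    'count': [817, 704, 606, 504, 423],
})

# Convert the year strings to integers and index the counts by year.
by_year = (
    year_counts
    .assign(year=year_counts['year'].astype(int))
    .set_index('year')['count']
)

# Reindex over the search range (1840 to 1905) so absent years count as zero.
full_range = (
    by_year
    .reindex(range(1840, 1906), fill_value=0)
    .rename_axis('year')
    .reset_index()
)

print(full_range.tail())
```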


Visualise the results.

In [17]:
alt.Chart(year_counts).mark_bar(size=9).encode(
    x=alt.X('year:Q', axis=alt.Axis(format='c')),
    y='count:Q',
    tooltip=['year', 'count']
).properties(width=700)