SRNSW Index Harvester

Here you'll find code to harvest data from the indexes published online by State Records NSW and save them as CSV files.

I've pre-harvested all the current indexes for your convenience. You can browse them below, or poke around in the data directory. You can also download a zip file (about 54 MB) containing the complete repository.

The repository includes two versions of each index. The web layout of the indexes on the State Archives site includes a number of empty columns. I've harvested and saved these as they are, but I have also created a 'cleaned' version with the unnecessary columns removed. Both versions are provided so you can check that nothing important has been lost in the cleaning process.
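The cleaning step can be pictured roughly as follows. This is only a minimal sketch, not the code used to build the cleaned files in this repository: the function names `drop_empty_columns` and `clean_csv` are made up for illustration, and it assumes a column counts as unnecessary only when every cell in it, including the header cell, is blank.

```python
import csv


def drop_empty_columns(rows):
    """Return the rows with every all-blank column removed."""
    rows = [list(row) for row in rows]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of cells.
    padded = [row + [""] * (width - len(row)) for row in rows]
    # Keep a column if at least one cell contains non-whitespace text.
    keep = [i for i in range(width) if any(row[i].strip() for row in padded)]
    return [[row[i] for i in keep] for row in padded]


def clean_csv(src_path, dest_path):
    """Read a harvested CSV and write a copy without the blank columns."""
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    with open(dest_path, "w", newline="", encoding="utf-8") as dest:
        csv.writer(dest).writerows(drop_empty_columns(rows))
```

Because only completely blank columns are dropped, comparing the cleaned file against the original should show the same number of rows and the same non-empty values.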

One index, the 'Early Convict Index', is linked on the State Archives site to an old version. A newer version exists, and that is the one I've harvested. Strangely, my harvest produces more rows than appear to be present on the site, and I'm not sure why.

There are more details about the available indexes on the SRNSW site.

Thanks to the SRNSW staff and volunteers for preparing all this most excellent data.

SRNSW content in copyright is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. See their copyright page for more information.

Harvested indexes

Currently: 60 indexes harvested with 1,488,222 rows of data.

| Name | Number of rows | Download data | View at SRNSW |
|---|---|---|---|
| Assisted Immigrants | 191688 | CSV file | Web site |
| Australian Railway Supply Detachment | 65 | CSV file | Web site |
| Bankruptcy Index | 28880 | CSV file | Web site |
| Bench of Magistrates cases, 1788-1820 | 4442 | CSV file | Web site |
| CSreLand | 10849 | CSV file | Web site |
| Child Care and Protection | 21980 | CSV file | Web site |
| Closer Settlement Transfer Registers, NRS 8082 | 4957 | CSV file | Web site |
| Closer and Soldier Settlement Transfer Files | 4503 | CSV file | Web site |
| Colonial Secretary Main series of letters received,1826-1982 | 7638 | CSV file | Web site |
| Convict Index | 141854 | CSV file | Web site |
| Convicts Applications to Marry 1825-51 | 5770 | CSV file | Web site |
| Coroners Inquests 1796-1824 | 808 | CSV file | Web site |
| Court of Civil Jurisdiction index | 2876 | CSV file | Web site |
| Criminal Court Records index 1788-1833 | 5028 | CSV file | Web site |
| Criminal Indictments, 1863-1919 | 15701 | CSV file | Web site |
| Deceased Estates | 267945 | CSV file | Web site |
| Depasturing Licenses | 7449 | CSV file | Web site |
| Divorce Index | 21240 | CSV file | Web site |
| Early Convict Index | 12940 | CSV file | Web site |
| FieldBooks | 813 | CSV file | Web site |
| Government Architect | 2373 | CSV file | Web site |
| Government Asylums for the Infirm and Destitute | 10264 | CSV file | Web site |
| Governor’s Court Case Papers, 1815-1824 | 3790 | CSV file | Web site |
| Index on Occupants on Aboriginal Reserves, 1875 to 1904 | 80 | CSV file | Web site |
| Index to 1841 Census | 9355 | CSV file | Web site |
| Index to Closer Settlement Promotion | 4354 | CSV file | Web site |
| Index to Court of Claims | 1052 | CSV file | Web site |
| Index to Deposition Registers | 65790 | CSV file | Web site |
| Index to Early Probate Records | 1627 | CSV file | Web site |
| Index to Gaol Photographs | 48171 | CSV file | Web site |
| Index to Intestate Estate Case Papers | 22520 | CSV file | Web site |
| Index to Miscellaneous Immigrants | 8821 | CSV file | Web site |
| Index to Quarter Sessions cases, 1824-37 | 6232 | CSV file | Web site |
| Index to Registers of Firms | 45683 | CSV file | Web site |
| Index to Squatters and Graziers | 9003 | CSV file | Web site |
| Index to Vessels Arrived, 1837 - 1925 | 120083 | CSV file | Web site |
| Index to convict exiles, 1846-50 | 3036 | CSV file | Web site |
| Index to the Unassisted Arrivals NSW 1842-1855 | 135792 | CSV file | Web site |
| Indigenous Colonial Court Cases 1788-1838 | 66 | CSV file | Web site |
| Insolvency Index | 23108 | CSV file | Web site |
| King’s and Queen’s Counsel Appointments | 2083 | CSV file | Web site |
| LandGrants | 5627 | CSV file | Web site |
| List of Maps and Plans (and Supplement) | 5455 | CSV file | Web site |
| NSW Chemists and Druggists | 2967 | CSV file | Web site |
| NSW Government Employees Granted Military Leave, 1914-1918 | 13735 | CSV file | Web site |
| NSW Govt Railways and Tramways - Roll of Honour - 1914-1919 | 1214 | CSV file | Web site |
| Naturalisation | 9860 | CSV file | Web site |
| Nominal Roll of the First Railway Section (AIF) | 417 | CSV file | Web site |
| Publicans Licenses | 18457 | CSV file | Web site |
| Railway Employment Records | 763 | CSV file | Web site |
| Register of Auriferous Leases | 53076 | CSV file | Web site |
| Registers of Nurses | 26665 | CSV file | Web site |
| Registers of Police | 11319 | CSV file | Web site |
| Registers of Settlement Purchases | 9776 | CSV file | Web site |
| Returned Soldier Settlement Loan Files | 7642 | CSV file | Web site |
| Returned Soldiers Settlement Misc files 1916-25 | 1050 | CSV file | Web site |
| Schools | 21245 | CSV file | Web site |
| Surveyor General - Letters received 1822-55 | 156 | CSV file | Web site |
| Teachers Rolls | 14867 | CSV file | Web site |
| Unemployed in Sydney 1866 | 3222 | CSV file | Web site |
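If you want to check row counts like these against your own copy of the data, something along the following lines may help. It is a rough sketch rather than part of the harvester: the helper names `count_rows` and `summarise` are invented here, and it assumes the CSV files sit directly inside the data directory, are UTF-8 encoded, and each start with a single header row.

```python
import csv
from pathlib import Path


def count_rows(csv_path, has_header=True):
    """Count the data rows in one CSV file, optionally skipping a header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        total = sum(1 for _ in csv.reader(f))
    return max(total - 1, 0) if has_header else total


def summarise(data_dir):
    """Return a {filename: row count} mapping and the overall total."""
    counts = {
        path.name: count_rows(path)
        for path in sorted(Path(data_dir).glob("*.csv"))
    }
    return counts, sum(counts.values())


if __name__ == "__main__":
    counts, total = summarise("data")
    for name, rows in counts.items():
        print(f"{name}: {rows}")
    print(f"{len(counts)} files, {total} rows in total")
```

Using `csv.reader` instead of counting raw lines means a field containing an embedded newline is still counted as part of a single row.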

Create your own harvest

You'll need to have RoboBrowser installed (it is published on PyPI as robobrowser).

To create a list of all the available indexes and their URLs, do:

import indexes
indexes.list_indexes()

This generates a CSV file, indexes.csv, listing each index's name and URL.

To harvest all the indexes you can do:

import indexes
indexes.get_all_indexes()

To harvest an individual index, find the one you want in the indexes.csv file and copy the row. Then you can do:

import harvest
harvest.get_index([paste the row in here])
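Instead of copying and pasting, you could look the row up in code. The sketch below is an illustration only: `find_index_row` is a hypothetical helper that is not part of this repository, and it assumes that the index name is in the first column of indexes.csv and that `harvest.get_index` accepts the row as a plain list, as the paste-in example above suggests.

```python
import csv


def find_index_row(name, csv_path="indexes.csv"):
    """Return the first row whose first cell matches name, or None."""
    wanted = name.strip().lower()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            # Compare case-insensitively and ignore surrounding whitespace.
            if row and row[0].strip().lower() == wanted:
                return row
    return None


# Hypothetical usage, assuming harvest.get_index takes the row as a list:
#
#     import harvest
#     row = find_index_row("Divorce Index")
#     if row is not None:
#         harvest.get_index(row)
```

Returning `None` when nothing matches makes a mistyped index name easy to spot before any harvesting starts.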