We provide snapshots of the MOH reports (Medical Officer of Health reports) at https://developers.wellcomecollection.org/docs/datasets#london-moh-reports
You can download the data tables in several formats (CSV, HTML, TXT, XML) or the full corpus as raw text.
I don't know how these data sets were created, but they're missing some files. In particular, while I was shutting down the Systems Strategy (Legacy) account (#5669), I found the S3 bucket that used to serve these snapshots. After unpacking the various zips and comparing files, I found we're missing some files in the current snapshots:
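The full text corpuses don't match: e.g. we have two copies of Barking.1958.b1978448x.txt, and BethnalGreen.1921.b18219962.txt is entirely missing from our current snapshots
The data tables in our current snapshots only go up to 1972, but we have tables from 1973 to 1978 in the old buckets, e.g. CityofLondon.1973.b18253908.csv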
I've uploaded all the files that aren't in the current snapshots to https://eu-west-1.console.aws.amazon.com/s3/buckets/wellcomecollection-assets-workingstorage?region=eu-west-1&prefix=moh-reports/&showversions=false, to make them easier to find.
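For anyone who wants to repeat the comparison, here is a rough sketch of one way to do it. This is not the script used for the findings above. It assumes the old and current snapshot zips have been downloaded into two local directories, and it compares files by base name only; the function names and directory layout are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath


def names_in_zips(zip_dir):
    """Count how often each file name appears across all zips in zip_dir."""
    counts = Counter()
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    # Assumption: files are matched on base name only,
                    # ignoring any folder structure inside the zip.
                    counts[PurePosixPath(info.filename).name] += 1
    return counts


def compare_snapshots(old_dir, current_dir):
    """Return (names missing from current, names duplicated in current)."""
    old = names_in_zips(old_dir)
    current = names_in_zips(current_dir)
    missing = sorted(set(old) - set(current))
    duplicated = sorted(name for name, n in current.items() if n > 1)
    return missing, duplicated
```

Calling `compare_snapshots("old-bucket", "current-snapshots")` (both directory names are placeholders) would return two sorted lists: files present in the old zips but absent from the current ones, and files that appear more than once in the current ones.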
We should fold these back into the public snapshots so everyone can get this data! Or work out why they were excluded (but since the digitised files are available online, I can't see why they would be).