vHMML Data Portal and JSON
vHMML Data Portal makes it possible for you to create a custom dataset for personal research or for digital humanities projects. Choose your own search criteria and then focus your search results using the same tools as in vHMML Reading Room.
vHMML Data Portal also provides users with the option of exporting curated datasets or the complete vHMML Reading Room dataset. Datasets are downloadable in JSON format, which provides the widest range of options for repurposing vHMML metadata for your projects.
HMML has uploaded sample digital humanities projects using metadata from vHMML Reading Room on its digital humanities resource site vHMML DH.
Downloading Metadata From vHMML Data Portal
One tool you can use to work with your vHMML JSON datasets is this Python script, which converts your downloaded JSON Listing data into a CSV file.
Don’t know Python? Don’t worry, you don’t need to. Python is typically already installed on a Mac. If you don’t have a Mac, you can use a service like PythonAnywhere to run the script instead.
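To give a feel for what a JSON-to-CSV conversion involves, here is a minimal sketch. It is not the vHMML script itself: the function name records_to_csv and the sample fields (id, title, language) are made up for illustration, and it assumes the data is a simple list of flat records.

```python
import csv
import io
import json


def records_to_csv(records):
    """Turn a list of flat dictionaries into CSV text, one row per record."""
    # Collect every key that appears, in first-seen order, as column headers.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Records that lack a column get an empty cell for it.
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample data; a real vHMML export uses its own field names.
sample_json = (
    '[{"id": 1, "title": "Psalter"},'
    ' {"id": 2, "title": "Gospels", "language": "Syriac"}]'
)
csv_text = records_to_csv(json.loads(sample_json))
print(csv_text)
```

The real script does this work for you, so you only need to run it as described in the steps that follow.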
To start, perform your search in the Data Portal.
Click the Export Table Search Data button to download your data. Note: The Python script only works on Table Search Data. To work with Full Search Data, try one of the other Data Portal tools.
When prompted, choose Yes to download.
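If you would like to confirm that the download is a readable JSON file before converting it, a quick check along these lines may help. This is only a sketch: the function name summarize_export is hypothetical, and the demonstration uses a small throwaway file rather than a real vHMML export.

```python
import json
import os
import tempfile


def summarize_export(path):
    """Load a JSON file and describe its top-level structure."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)  # raises an error if the file is not valid JSON
    if isinstance(data, list):
        return f"list with {len(data)} items"
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    return type(data).__name__


# Demonstration with a throwaway file standing in for a downloaded export.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "export.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        json.dump([{"id": 1}, {"id": 2}], handle)
    summary = summarize_export(demo_path)

print(summary)
```

To try it on your own data, pass the path of your downloaded JSON file to summarize_export instead of the demo file.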
Using Python on a Mac
Download the Python script you’ll need (https://github.com/vHMML/vhmml-CSV-Listing-Data)
Open a terminal (via Launchpad, Other folder) and navigate to the directory that contains both the Python script and your downloaded JSON file.
This is typically done with the “cd” (“change directory”) command.
Once in the directory that contains both files, type “python” (to run Python), then the Python script file name, then the JSON file name, and press Enter. For example: python vhmmlCSVfromListingData.py XXXXX.json
Then you can open your CSV file.
Note: You should read the “open a CSV file” guide to understand Unicode and CSV.
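As a rough illustration of why encoding matters, the sketch below reads a CSV file in Python while stating the encoding explicitly. It assumes the file is UTF-8 encoded; the function name read_csv_rows and the sample title are made up for the demonstration.

```python
import csv
import os
import tempfile


def read_csv_rows(path):
    """Read a CSV file as UTF-8 and return its rows as lists of strings."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Demonstration with a throwaway file containing a non-ASCII title.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([["title"], ["Saint-Benoît"]])
    rows = read_csv_rows(demo_path)

print(rows)
```

If accented or non-Latin characters look garbled in another program, a mismatch between the file's encoding and the one the program assumes is a likely cause.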
Using Python on a PC: PythonAnywhere
If you don’t have a Mac, or want to run your Python script in the cloud, you can use PythonAnywhere.
First, create an account at PythonAnywhere.
Then log in. From the Dashboard, click FILES in the upper right and create a “vhmml” directory (yellow input box).
Change to the new vhmml directory and click the yellow “Upload a file” button to upload the Table Results Data JSON File.
From here, you can upload the Python script you downloaded from GitHub (https://github.com/vHMML/vhmml-CSV-Listing-Data).
Or, you can instead clone the Python script. From the bash command line run: git clone https://github.com/vHMML/vhmml-CSV-Listing-Data.git
This will clone the GitHub repository from vHMML onto your PythonAnywhere account.
Your PythonAnywhere directory should now show the dataset JSON file and the Python program in the same folder.
From here, click the “Open Bash console here” link. This will open something that looks and acts like a terminal.
You can run “ls” to list your files, or run the program with “python vhmmlCSVfromListingData.py XXXXX.json”.
After the file has been converted, click the hamburger menu and choose Files.
Navigate to your files, and download your new CSV file using the download icon.
Finally, open the file in Excel. Note: You should read the “open a CSV in Microsoft Excel” guide to understand Unicode and Excel.
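One common workaround, offered here only as a sketch and not as part of the vHMML tools, is to save a copy of the CSV with a UTF-8 byte order mark (BOM), which Excel generally uses to recognize the file as UTF-8. The function name save_copy_with_bom and the file names are hypothetical, and the code assumes the original CSV is UTF-8 encoded.

```python
import os
import tempfile


def save_copy_with_bom(source_path, target_path):
    """Write a copy of a UTF-8 CSV file that starts with a UTF-8 BOM."""
    # Reading with "utf-8-sig" accepts UTF-8 with or without an existing BOM.
    with open(source_path, newline="", encoding="utf-8-sig") as source:
        text = source.read()
    # Writing with "utf-8-sig" puts exactly one BOM at the start of the file.
    with open(target_path, "w", newline="", encoding="utf-8-sig") as target:
        target.write(text)


# Demonstration with throwaway files standing in for a converted CSV.
with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "listing.csv")
    target_path = os.path.join(tmp, "listing_excel.csv")
    with open(source_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("title\nSaint-Benoît\n")
    save_copy_with_bom(source_path, target_path)
    with open(target_path, "rb") as handle:
        raw = handle.read()

print(raw[:3])
```

The guide mentioned above remains the place to look for the recommended way to open the file in Excel.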