This project analyses the programming practices of the New York Philharmonic. In particular, it examines how often the Philharmonic programs works and composers that have not previously been programmed in the ensemble's history.
For the purpose of this project, decisions to perform new works or composers are considered forms of "expansive programming," as they expand the orchestra's repertoire.
This project is an independent analysis conducted using data from the New York Philharmonic Archives, which has released the organization's performance history under a public domain license. The analysis and findings in this project are independent and do not represent official positions of the New York Philharmonic.
This README file was created on 2024-11-14 and was last updated on 2024-11-18.
Jake Gibson
Pratt Institute
jgibso92@pratt.edu
ORCID: 0009-0005-0099-9197
September 2024 – December 2024
Gibson, J. (2024). New York Philharmonic Programming Analysis: New Works and Composers [Dataset]. GitHub. https://github.com/jakegibb/nyphil-programming-analysis
A Binder environment has been created for this project. Launch the project in Binder to run its Jupyter Notebooks in an isolated computing environment.
Additionally, each notebook includes a link to open it in Google Colab if you prefer that environment. However, to run a notebook in Colab, you will need to download the appropriate data source and add it to Colab's "content" folder.
- File: 'nyphil_performance_history_data_restructuring.ipynb'
- Functions:
- Transforms the performance history JSON data from the New York Philharmonic so that it is structured around individual performances of orchestral works instead of unique programs.
- Counts the number of new works and new composers performed in each season of the New York Philharmonic.
- Output:
- 'nyphil_programming_data.csv'
- Functions:
- File: 'nyphil_expansive_programming_data_by_season.csv'
- Counts of new works and new composers performed by the New York Philharmonic from 1842 to 2023, as well as basic descriptive statistics for this data.
- See "Data Structure" section for variables and their definitions.
- File: 'nyphil_programming_analysis.ipynb'
- Plots data from 'nyphil_expansive_programming_data_by_season.csv' using matplotlib.
- File: 'nyphil_proramming_vizzu.ipynb'
- Uses the ipyvizzu library to create an interactive visualization of new work and composer data.
- GitHub repository cloned on 2024-09-10 from the New York Philharmonic Archives.
- Please see the README file enclosed in the folder for more information regarding the repository's structure, data structure, variable definitions, and usage guidelines.
JSON data from the New York Philharmonic Archives is structured around unique programs performed by the orchestra. To analyze the frequency of "expansive programming" decisions, the data needs to be restructured around individual performances of single works.
The following steps are used to transform the performance history data:
- JSON data is "exploded" to create a record for every work performed on a unique program.
- Each work is assigned a performance date based on the earliest date a unique program was performed.
- The total numbers of works and composers performed in each season are counted.
- Copies of the dataset are filtered for only the first performance of a unique work or composer. These subsets are then grouped by season and counted.
- Descriptive statistics are aggregated in a single DataFrame and exported to CSV.
- Secondary Jupyter Notebooks import the CSV data, run statistical tests, and create visualizations.
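The counting steps above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the pandas code used in the notebooks. The `programs` sample, the field names, and both helper functions are hypothetical, and the sketch assumes that season totals count distinct works and composers.

```python
from collections import defaultdict

# Hypothetical, made-up programs (not real archive data). Each program may
# have several concert dates and several works.
programs = [
    {"season": "1842-43", "dates": ["1842-12-10", "1842-12-07"],
     "works": [("Composer A", "w1"), ("Composer B", "w2")]},
    {"season": "1843-44", "dates": ["1843-11-18"],
     "works": [("Composer A", "w1"), ("Composer A", "w3"), ("Composer C", "w4")]},
]

# "Explode" programs into one row per work, dated by the program's
# earliest concert (ISO date strings compare correctly as text).
rows = [
    {"season": p["season"], "date": min(p["dates"]), "composer": c, "work": w}
    for p in programs
    for c, w in p["works"]
]


def totals_by_season(rows, field):
    """Count distinct values of `field` per season (an assumption about totals)."""
    distinct = defaultdict(set)
    for row in rows:
        distinct[row["season"]].add(row[field])
    return {season: len(values) for season, values in distinct.items()}


def firsts_by_season(rows, field):
    """Count, per season, the values of `field` seen for the first time."""
    seen = set()
    counts = defaultdict(int)
    for row in sorted(rows, key=lambda r: r["date"]):
        if row[field] not in seen:
            seen.add(row[field])
            counts[row["season"]] += 1
    return dict(counts)


summary = {
    "total_works": totals_by_season(rows, "work"),
    "new_works": firsts_by_season(rows, "work"),
    "total_composers": totals_by_season(rows, "composer"),
    "new_composers": firsts_by_season(rows, "composer"),
}
print(summary)
```

Keeping a running set of works or composers already seen while walking the rows in date order gives the same result as filtering for first performances and then grouping by season.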
Jupyter Notebooks for this project were written using VSCode and Python 3.12.6. The following libraries and packages (beyond the standard Python library) were used:
A "requirements.txt" file is included in the repository for ease of reproducing the Python environment.
The original JSON data structure from the New York Philharmonic – as documented in the enclosed README – is as follows:
{
  "programs": [
    {
      "id": "38e072a7-8fc9-4f9a-8eac-3957905c0002", // GUID
      "programID": "3853", // NYP Local ID
      "orchestra": "New York Philharmonic",
      "season": "1842-43",
      "concerts": [
        {
          "eventType": "Subscription Season",
          "Location": "Manhattan, NY",
          "Venue": "Apollo Rooms",
          "Date": "1842-12-07T05:00:00Z",
          "Time": "8:00PM"
        },
        /* A program can have multiple concerts */
      ],
      "works": [
        {
          "ID": "8834*4", // e.g. "1234*1" - first part is the Work ID, second part is the NYP Movement ID
          "composerName": "Weber, Carl Maria Von",
          "workTitle": "OBERON",
          "movement": "\"Ozean, du Ungeheuer\" (Ocean, thou mighty monster), Reiza (Scene and Aria), Act II",
          "conductorName": "Timm, Henry C.",
          "soloists": [
            {
              "soloistName": "Otto, Antoinette",
              "soloistInstrument": "Soprano",
              "soloistRoles": "S"
            },
            /* more soloists, if applicable. If no soloists, this will be an empty array */
          ]
        },
        /* a program will usually have multiple works */
        {
          "ID": "0*",
          "interval": "Intermission",
          "soloists": []
        },
        /* Intermissions will also appear in the works array */
      ]
    },
    /* more programs */
  ]
}
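As a rough illustration of how this structure can be read, the sketch below walks a trimmed copy of the sample above. The comments are removed so that it parses as strict JSON. The `iter_works` helper and its output field names are hypothetical and are not taken from the project's notebooks.

```python
import json

# Trimmed copy of the documented sample (comments, movement, and soloists
# omitted so the text is valid JSON).
SAMPLE = """
{
  "programs": [
    {
      "id": "38e072a7-8fc9-4f9a-8eac-3957905c0002",
      "programID": "3853",
      "orchestra": "New York Philharmonic",
      "season": "1842-43",
      "concerts": [
        {"eventType": "Subscription Season", "Location": "Manhattan, NY",
         "Venue": "Apollo Rooms", "Date": "1842-12-07T05:00:00Z", "Time": "8:00PM"}
      ],
      "works": [
        {"ID": "8834*4", "composerName": "Weber, Carl Maria Von",
         "workTitle": "OBERON", "conductorName": "Timm, Henry C.", "soloists": []},
        {"ID": "0*", "interval": "Intermission", "soloists": []}
      ]
    }
  ]
}
"""


def iter_works(data):
    """Yield one flat record per work, skipping intermission entries."""
    for program in data["programs"]:
        for work in program["works"]:
            if "interval" in work:  # intermissions carry an "interval" key
                continue
            # "8834*4" -> work ID "8834", movement ID "4"
            work_id, _, movement_id = work["ID"].partition("*")
            yield {
                "program_id": program["programID"],
                "season": program["season"],
                "work_id": work_id,
                "movement_id": movement_id,
                "composer": work.get("composerName"),
                "title": work.get("workTitle"),
            }


records = list(iter_works(json.loads(SAMPLE)))
print(records)
```

Splitting the `ID` on `*` keeps the work ID separate from the movement ID, which makes it possible to treat several movements of the same work as one work.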
The following are selected variable definitions from the New York Philharmonic that are particularly relevant for this analysis (more detailed definitions are documented in the README file enclosed within the cloned repository):
Field | Description |
---|---|
Season | Defined as Sep 1 - Aug 31, displayed "1842-43" |
concerts.Date | Full ISO date used, but ignore TIME part (1842-12-07T05:00:00Z = Dec. 7, 1842) |
works.ComposerName | Composer Last name, first / TITLE (NYP short titles used) |
works.WorkTitle | Work Title as cataloged by NYP |
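The date and season conventions in this table can be illustrated with a short sketch. Both helper functions are hypothetical. The two-digit suffix used for the season label (for example "1999-00") is an assumption based on the "1842-43" display format.

```python
from datetime import date


def concert_date(iso_timestamp):
    """Return only the calendar date of a concerts.Date value, ignoring the time part."""
    return date.fromisoformat(iso_timestamp[:10])


def season_label(d):
    """Map a date to its season label, with seasons running Sep 1 to Aug 31."""
    start = d.year if d.month >= 9 else d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


d = concert_date("1842-12-07T05:00:00Z")
print(d, season_label(d))
```

Under this definition, a concert on 31 August 1843 still belongs to the 1842-43 season, while one on 1 September 1843 opens 1843-44.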
Data processing in 'programming analysis.ipynb' outputs a CSV file containing descriptive statistics for "expansive programming" decisions grouped by season.
The following is a list of variables and definitions for each field in the resulting CSV file.
Field | Description | Type |
---|---|---|
season | Defined as Sep 1 - Aug 31, displayed as a two year range (e.g. "1842-43") | str |
total_composers | The total number of composers performed within one season. | int |
new_composers | The number of composers within a season that are being performed for the first time in the orchestra's history. | int |
repeat_composers | The number of composers performed within a season that have previously been performed by the orchestra. | int |
new_composers_p | The proportion of composers performed within a season that are 'new_composers' (i.e. have never been performed by the orchestra). | float |
repeat_composers_p | The proportion of composers performed within a season that are 'repeat_composers' (i.e. have previously been performed by the orchestra). | float |
total_works | The total number of works performed within a season (if multiple movements of a single work are performed, that is counted as one work). | int |
new_works | The number of works within a season that are being performed for the first time in the orchestra's history. | int |
repeat_works | The number of works performed within a season that have previously been performed by the orchestra. | int |
new_works_p | The proportion of works performed within a season that are 'new_works' (i.e. have never been performed by the orchestra). | float |
repeat_works_p | The proportion of works performed within a season that are "repeat_works" (i.e. have previously been performed by the orchestra). | float |
new_works_and_composers_p | The proportion of 'new_works' in a season that are also composed by 'new_composers'. | float |
season_year | The start year of an orchestral season. Useful for filtering and sorting. | int |
decade_bin | Range created by pandas.cut function to bin seasons by decade. Decades are defined from ###0 – ###9 (e.g. 1980-1989). | category |
decade | Single integer representing the start of a decade. Decades are defined from ###0 – ###9 (e.g. 1980-1989). | int |
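To make the relationships between these fields concrete, the sketch below derives several of them for a single season. The `derive_fields` helper and the counts passed to it are made up for illustration. It assumes that `repeat_works` equals `total_works` minus `new_works`, and it uses integer arithmetic for `decade` instead of reproducing the pandas.cut binning behind `decade_bin`.

```python
def derive_fields(season, total_works, new_works):
    """Derive season_year, decade, and work proportions for one season.

    Assumes total_works > 0 and repeat_works = total_works - new_works.
    """
    season_year = int(season[:4])    # "1984-85" -> 1984
    decade = season_year // 10 * 10  # 1984 -> 1980
    repeat_works = total_works - new_works
    return {
        "season": season,
        "season_year": season_year,
        "decade": decade,
        "total_works": total_works,
        "new_works": new_works,
        "repeat_works": repeat_works,
        "new_works_p": new_works / total_works,
        "repeat_works_p": repeat_works / total_works,
    }


# Made-up counts, not taken from the dataset.
row = derive_fields("1984-85", 120, 30)
print(row)
```

The composer fields follow the same pattern, with the two proportions for a season summing to 1.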
This project is released under a CC0 1.0 Universal License.
The New York Philharmonic data underlying this analysis was released under the same CC0 license. Please see the usage guidelines in the dataset's original README (located in the "nyphilarchive_performance_history" folder) for more information.