We are team 7 and we made the ContextKenner to help sift through the data and find the important pieces.
You can check out our results at https://n-dijkstra.shinyapps.io/ContextKenner/.
The importance of a piece is calculated by first scoring each word for its relative importance, determined by its rarity in the corpus as a whole versus its frequency in that particular piece. The score for a piece is then the sum of the scores of its words.
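
As a rough illustration of this idea, here is a minimal sketch in Python. It assumes a TF-IDF-style weighting (frequency within the piece times the log of inverse document frequency), which may differ from the exact formula used in the notebook. The names `tokenize` and `score_pieces` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def score_pieces(pieces):
    """Return one importance score per piece (assumed TF-IDF-style sum)."""
    tokenized = [tokenize(p) for p in pieces]
    n_pieces = len(tokenized)

    # Number of pieces in which each word appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = 0.0
        for word, count in counts.items():
            tf = count / len(tokens)                    # frequency in this piece
            idf = math.log(n_pieces / doc_freq[word])   # rarity across the corpus
            total += tf * idf
        # The piece score is the sum of its word scores.
        scores.append(total)
    return scores


pieces = ["the cat sat", "the dog sat", "the quantum cat"]
scores = score_pieces(pieces)
```

In this toy corpus the word "the" occurs in every piece and so contributes nothing, while words that occur in only one piece, such as "dog" or "quantum", raise the score of the piece that contains them.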
We also tried other methods for calculating importance; all steps can be found in the Jupyter Notebook.
Bram vdn Heuvel Neele Dijkstra