Web scraper for parsing, transforming, and storing data from Jeopardy!'s Jeopardata webpage. It is also the Extract, Transform, and Load (ETL) process that powers this Tableau Public Dashboard.
- Full data scrape and store
- Incremental data scrape and store
- Provide an API and file export mechanisms for data analyst Jeopardy! fans to easily explore Jeopardata
- Analyze and identify patterns in Jeopardy! gameplay over time, such as inflection points introduced by influential players like James Holzhauer
- Provide a location to store historic Jeopardy! data in a transformed format more suitable for data-driven analysis and applications
The application is scheduled to execute once per day, after Jeopardy! posts the new episode data. S3 was chosen as the backend to keep costs as low as possible (storage costs less than $1 per year).
- File Name: jeopardata.csv
- Sheet Name: jeopardata
- Description: The historical box scores for each player in each Jeopardy! game
- Database Product: PostgreSQL
- Database Name: jeaopardata
- Table Name: jeopardy_game_box_scores
- Description: The historical box scores for each player in each Jeopardy! game
Field Name | Data Type | Description |
---|---|---|
EpisodeNumber | varchar(100) | The episode number of the Jeopardy game |
EpisodeTitle | varchar(100) | The title of the Jeopardy episode |
EpisodeDate | date | The airing date of the Jeopardy episode |
ContestantLastName | varchar(100) | The last name of the contestant |
ContestantFirstName | varchar(100) | The first name of the contestant |
HomeCity | varchar(100) | The home city of the contestant |
HomeState | varchar(100) | The home state of the contestant |
IsWinner | boolean | Whether the contestant won the game |
RoundOneAttempts | integer | Number of attempts in round one |
RoundOneBuzzes | integer | Number of buzzes in round one |
RoundOneBuzzPercent | integer | Percentage of buzzes in round one |
RoundOneCorrectAnswers | integer | Number of correct answers in round one |
RoundOneIncorrectAnswers | integer | Number of incorrect answers in round one |
RoundOneCorrectAnswerPercent | integer | Percentage of correct answers in round one |
RoundOneDailyDoubles | integer | Number of daily doubles in round one |
RoundOneScore | integer | Score at the end of round one |
RoundTwoAttempts | integer | Number of attempts in round two |
RoundTwoBuzzes | integer | Number of buzzes in round two |
RoundTwoBuzzPercent | integer | Percentage of buzzes in round two |
RoundTwoCorrectAnswers | integer | Number of correct answers in round two |
RoundTwoIncorrectAnswers | integer | Number of incorrect answers in round two |
RoundTwoCorrectAnswerPercent | integer | Percentage of correct answers in round two |
RoundTwoDailyDouble1 | integer | First daily double found in round two |
RoundTwoDailyDouble2 | integer | Second daily double found in round two |
RoundTwoScore | integer | Score at the end of round two |
FinalJeopardyStartingScore | integer | Starting score for Final Jeopardy |
FinalJeopardyWager | integer | Wager for Final Jeopardy |
FinalJeopardyScore | integer | Score after Final Jeopardy |
TotalGameAttempts | integer | Total number of attempts throughout the game |
TotalGameBuzzes | integer | Total number of buzzes throughout the game |
TotalGameBuzzPercent | integer | Total percentage of buzzes throughout the game |
TotalGameCorrectAnswers | integer | Total number of correct answers throughout the game |
TotalGameIncorrectAnswers | integer | Total number of incorrect answers throughout the game |
TotalGameCorrectAnswerPercent | integer | Total percentage of correct answers throughout the game |
TotalGameDailyDoublesCorrect | integer | Number of daily doubles answered correctly throughout the game |
TotalGameDailyDoublesIncorrect | integer | Number of daily doubles answered incorrectly throughout the game |
TotalGameDailyDoubleWinnings | integer | Total winnings from daily doubles throughout the game |
TotalGameScore | integer | Total score at the end of the game |
TotalTripleStumpers | integer | Total number of triple stumpers throughout the game |
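
As a minimal sketch of how an analyst might consume the export, the snippet below parses a few rows that use a subset of the columns above and computes the average winning score. The sample rows are made up for illustration, and the serialization details (ISO dates, `true`/`false` for booleans) are assumptions, not confirmed properties of `jeopardata.csv`. The helper names `parse_row` and `average_winning_score` are hypothetical.

```python
import csv
import io
from datetime import date

# Made-up sample rows using a subset of the documented columns.
# Assumption: dates are ISO formatted and booleans are written as true/false.
SAMPLE_CSV = """EpisodeNumber,EpisodeDate,ContestantLastName,IsWinner,TotalGameScore
1,2024-01-01,Alpha,true,20000
1,2024-01-01,Beta,false,8000
2,2024-01-02,Gamma,true,12000
"""


def parse_row(row):
    """Convert the string values of one CSV row into the documented data types."""
    return {
        "EpisodeNumber": row["EpisodeNumber"],  # varchar, kept as text
        "EpisodeDate": date.fromisoformat(row["EpisodeDate"]),
        "ContestantLastName": row["ContestantLastName"],
        "IsWinner": row["IsWinner"].strip().lower() == "true",
        "TotalGameScore": int(row["TotalGameScore"]),
    }


def average_winning_score(rows):
    """Return the mean TotalGameScore across winning contestants, or None."""
    scores = [r["TotalGameScore"] for r in rows if r["IsWinner"]]
    return sum(scores) / len(scores) if scores else None


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(average_winning_score(rows))  # 16000.0 for the made-up rows above
```

To run this against a real export, the `io.StringIO(SAMPLE_CSV)` argument would be replaced with an open file handle, and `parse_row` extended to cover the remaining columns.
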
1. For each web page
2. Collect each episode by DATE
3. For each date
4. Collect the names & home city info of the contestants
5. Collect Jeopardy Round Data
6. Collect Double Jeopardy Round Data
7. Collect Final Jeopardy Round Data
8. Collect game totals data
9. Write to DB
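
The full-scrape steps above can be sketched as a pair of nested loops. This is an illustration under stated assumptions, not the project's actual code: `full_scrape`, its parameters, and the stand-in callables are all hypothetical, and the real collectors would fetch and parse Jeopardata pages rather than return empty placeholders.

```python
from datetime import date


def full_scrape(pages, episode_dates_on, collect_steps, write_to_db):
    """Run every collection step for every episode date on every page.

    All four arguments are hypothetical hooks supplied by the caller.
    Returns the number of records passed to write_to_db.
    """
    written = 0
    for page in pages:  # step 1: for each web page
        for episode_date in episode_dates_on(page):  # steps 2-3: each episode by date
            record = {"EpisodeDate": episode_date}
            for step in collect_steps:  # steps 4-8: contestants, rounds, totals
                record.update(step(page, episode_date))
            write_to_db(record)  # step 9: write to DB
            written += 1
    return written


# Stand-in callables so the sketch runs without network or database access.
fake_pages = {"page-1": [date(2024, 1, 1), date(2024, 1, 2)]}
stored = []
count = full_scrape(
    pages=fake_pages,
    episode_dates_on=lambda page: fake_pages[page],
    collect_steps=[
        lambda page, d: {"contestants": []},
        lambda page, d: {"jeopardy_round": {}},
        lambda page, d: {"double_jeopardy_round": {}},
        lambda page, d: {"final_jeopardy_round": {}},
        lambda page, d: {"game_totals": {}},
    ],
    write_to_db=stored.append,
)
print(count, len(stored))
```

Passing the collectors in as callables keeps the loop testable offline; the record here is one dictionary per episode for brevity, whereas the documented table holds one row per contestant per game.
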
1. Check the last date in the DB
2. Collect each episode by DATE from the last date until the current date
3. For each date
4. Collect the names & home city info of the contestants
5. Collect Jeopardy Round Data
6. Collect Double Jeopardy Round Data
7. Collect Final Jeopardy Round Data
8. Collect game totals data
9. Write to DB
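
The difference between the incremental and full scrape is the date window. A minimal sketch of that window calculation follows, assuming the last date stored in the DB has already been fully loaded, so collection resumes on the following day. The function name `dates_to_scrape` is hypothetical.

```python
from datetime import date, timedelta


def dates_to_scrape(last_loaded, today):
    """Return each calendar date after last_loaded, up to and including today."""
    days = (today - last_loaded).days
    # range(1, ...) skips last_loaded itself; a non-positive gap yields no dates.
    return [last_loaded + timedelta(days=n) for n in range(1, days + 1)]


print(dates_to_scrape(date(2024, 1, 1), date(2024, 1, 4)))
# [datetime.date(2024, 1, 2), datetime.date(2024, 1, 3), datetime.date(2024, 1, 4)]
```

Dates on which no episode was posted would simply produce no rows in the per-date steps, so the window can safely include every calendar day.
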