Playing with soccer datasets to illustrate some of Spark's features in Java.
If you are interested in using Spark, consider the book Spark in Action, 2e, by Jean-Georges Perrin, published by Manning. Find out more about the book at: https://www.manning.com/books/spark-with-java.
Designed for Apache Spark v3.0.0.
The examples perform basic analytics with Spark: nothing you could not do with a SQL database, except maybe predicting the winner of the 2022 soccer World Cup.
The datasets used in these labs come from:
https://www.kaggle.com/jsppimentel99/coparussiajogos
- Cup.Russia.Matches.csv
- Cup.Russia.Teams.csv
https://www.kaggle.com/abecklas/fifa-world-cup#WorldCups.csv
- WorldCupMatches.csv
- WorldCupPlayers.csv
- WorldCups.csv
https://data.world/sawya/football-world-cup-2018-dataset
- Fixture.csv
- Players_Score.csv
- Players_Stats.csv
- Players.csv
- Teams.csv