Howard Hamilton edited this page Jan 16, 2016 · 4 revisions

Welcome to the Marcotti wiki!

Marcotti is the new name of the Football Match Results Database schema. It is a framework developed by Soccermetrics for building databases of football (soccer) matches.

Marcotti captures major match events for teams, whether they be clubs or national teams, that participate in friendly matches or league, knockout, and hybrid (league+knockout) competitions, as well as summary in-match statistics of participating players. These data are used to enable and support research activities for the benefit of the football analytics community.

The Name

Yes, the Marcotti data schemas are named in honor of football pundit Gabriele Marcotti. The project has nothing to do with him, but he did appreciate the gesture.

Why Marcotti?

Analytics projects depend on data, and the collection and preprocessing of data takes up between 60% and 80% of a typical project. As a project's scope grows more complicated, the challenge of collecting and wrangling data becomes more daunting and painful. This data schema project grew out of a desire to collect football data once and access it multiple times and in multiple ways.

History of Marcotti

The Marcotti schema was originally called the Football Match Results Database. It was created in 2011 and was refined and extended over the following years. The data models defined by the schema served as the foundation for the Soccermetrics API products, and an analytics library was written to interact with the models.

The following links provide context around Marcotti and the original design decisions:

The Current Marcotti Schema

The original data schema consisted of scripts that defined tables and views in raw SQL. Two schemas were created: one for matches involving club teams, and another for matches involving national teams.

The current data schema uses the SQLAlchemy database library to map the database tables to Python classes that define the corresponding data models. This allows us to build a collection of data models that are common to both the club and national team schemas. It also permits the creation of base models with common attributes and methods that are then inherited by other models.
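As a rough illustration of the inheritance idea described above, the sketch below defines a mixin with shared columns and a shared method, then reuses it in a club model and a national team model. The class, table, and column names (MatchMixin, ClubMatch, NationalTeamMatch, and so on) are hypothetical and are not taken from the actual Marcotti models; the import path assumes SQLAlchemy 1.4 or later.

```python
import datetime

from sqlalchemy import Column, Date, Integer, String
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+ import path

Base = declarative_base()


class MatchMixin:
    """Attributes and methods shared by every match model (hypothetical names)."""
    id = Column(Integer, primary_key=True)
    match_date = Column(Date, nullable=False)
    home_goals = Column(Integer, nullable=False, default=0)
    away_goals = Column(Integer, nullable=False, default=0)

    def total_goals(self):
        # Common behavior inherited by both concrete models below.
        return self.home_goals + self.away_goals


class ClubMatch(MatchMixin, Base):
    """Match between club teams; adds club-specific columns."""
    __tablename__ = "club_matches"
    home_club = Column(String(60), nullable=False)
    away_club = Column(String(60), nullable=False)


class NationalTeamMatch(MatchMixin, Base):
    """Match between national teams; adds country-specific columns."""
    __tablename__ = "national_team_matches"
    home_country = Column(String(60), nullable=False)
    away_country = Column(String(60), nullable=False)


# Both models expose the inherited columns and the inherited method.
match = ClubMatch(
    match_date=datetime.date(2016, 1, 16),
    home_club="Club A",
    away_club="Club B",
    home_goals=2,
    away_goals=1,
)
print(match.total_goals())
print(NationalTeamMatch.__table__.columns.keys())
```

SQLAlchemy copies plain columns declared on a mixin into each mapped class, so each table gets its own id, match_date, and goal columns without repeating the definitions.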

Using SQLAlchemy to define the data schema allows us to offload low-level read/write operations to the library. It also makes it easier to write test suites for these data models.
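To give a sense of why testing becomes easier, here is a minimal sketch of a round-trip test against an in-memory SQLite database. The Competition model, the make_session helper, and the test function are assumptions made for illustration only and do not reflect the project's actual models or test suite; the import path assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker  # SQLAlchemy 1.4+

Base = declarative_base()


class Competition(Base):
    """Hypothetical model used only for this example."""
    __tablename__ = "competitions"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), nullable=False, unique=True)


def make_session():
    # A fresh in-memory SQLite database: nothing to install or clean up.
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)  # the library emits the CREATE TABLE statements
    return sessionmaker(bind=engine)()


def test_competition_round_trip():
    session = make_session()
    try:
        # Write through the model, then read it back through the model.
        session.add(Competition(name="Example League"))
        session.commit()
        stored = session.query(Competition).one()
        assert stored.name == "Example League"
        return stored.name
    finally:
        session.close()


test_competition_round_trip()
```

Because the models carry their own table definitions, each test can build a throwaway database from them and never touch raw SQL scripts.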