# Grapher schema compliance

_Do grapher chart configs actually match the schema that we have in our codebase?_

## Summary of findings

- ~4.5% of grapher configs fail validation (195 charts)
- Two big sources of errors
    - Setting a property to null instead of omitting it
    - Using colour palettes that the schema does not know about
- Plus a range of smaller issues that are either genuine errors or cases where the schema was not updated after a change
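
To make the null issue concrete: a property that the schema types as a string does not accept null, whereas leaving the key out is fine as long as the property is not required. One possible clean-up, sketched below, is to drop null-valued keys before validating. The helper `drop_nulls` and the `subtitle` and `projection` keys are hypothetical illustrations, not part of the grapher codebase, and this assumes null never carries meaning in a config.

```python
def drop_nulls(value):
    """Recursively remove dict keys whose value is None (JSON null).

    None entries inside lists are left alone; only dict keys are dropped.
    """
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


# Made-up config fragment for illustration, not a real chart.
config = {"id": 1, "subtitle": None, "map": {"time": "latest", "projection": None}}
cleaned = drop_nulls(config)
print(cleaned)
```

Whether dropping nulls is safe depends on how grapher treats a missing key compared with an explicit null, which this notebook does not check.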

## Preamble

In [1]:
import jsonschema

In [2]:
import json

## Fetch all live configs

In [3]:
%%mysql -p live-readonly -i id

select id, config from charts

In [4]:
df['config'] = df.config.apply(json.loads)

In [5]:
df.head()

id,config
20,"{'id': 20, 'map': {'time': 'latest', 'colorSca..."
26,"{'id': 26, 'map': {'colorScale': {'baseColorSc..."
27,"{'id': 27, 'map': {'colorScale': {'baseColorSc..."
29,"{'id': 29, 'map': {'colorScale': {'baseColorSc..."
31,"{'id': 31, 'map': {'time': 2020, 'colorScale':..."


## Load the schema

In [6]:
# Load the schema from a local checkout of owid-grapher
with open('/Users/lars/Documents/owid/owid-grapher/packages/@ourworldindata/grapher/src/schema/grapher-schema.003.json') as f:
    schema = json.load(f)

## Validate every config

In [7]:
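def validation_error(config):
    """Return the validation error for a config, or None if it is valid."""
    try:
        jsonschema.validate(config, schema)
    except jsonschema.ValidationError as e:
        return e
    return None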
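
Note that `jsonschema.validate` raises a single error per config, so a chart with several problems is counted only once here. A minimal sketch of collecting every error instead, assuming the `jsonschema` package already used in this notebook; `toy_schema` and `all_validation_errors` are made up for illustration and are not the real grapher schema or an existing helper:

```python
from jsonschema.validators import validator_for

# Toy schema for illustration only; the real grapher schema is much larger.
toy_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "additionalProperties": False,
}


def all_validation_errors(config, schema):
    """Return every validation error for a config, sorted by message."""
    # Pick the validator class matching the schema's declared draft, if any.
    validator = validator_for(schema)(schema)
    return sorted(validator.iter_errors(config), key=lambda e: e.message)


# A made-up config with two separate problems.
errors = all_validation_errors({"title": None, "order": 3}, toy_schema)
messages = [e.message for e in errors]
# e.validator is the schema keyword that failed, e.g. 'type'.
keywords = sorted(e.validator for e in errors)
print(messages)
print(keywords)
```

Grouping by `e.validator` rather than by the full message may also give a coarser view of which kinds of rule are being broken.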

In [8]:
df['validation_error'] = df.config.apply(validation_error)

In [9]:
df['is_valid'] = df.validation_error.isnull()

In [10]:
df.is_valid.mean()

0.9549861495844876

In [11]:
(~df.is_valid).sum()

195
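
As a sanity check, the two outputs above can be cross-checked against each other. This uses only the printed numbers; the total chart count is derived from them rather than read from the database:

```python
valid_fraction = 0.9549861495844876  # output of df.is_valid.mean()
invalid_count = 195                  # output of (~df.is_valid).sum()

# Total number of configs implied by the two figures.
total = round(invalid_count / (1 - valid_fraction))
invalid_share = invalid_count / total

print(total, f"{invalid_share:.1%}")  # prints: 4332 4.5%
```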

## What are the validation errors?

In [12]:
df[~df.is_valid].validation_error.apply(lambda e: e.message).value_counts()

None is not of type 'string'    53
Additional properties are not allowed ('chartId', 'id', 'order' were unexpected)                                                                                                                                                                                                                                                                                                                                                                                                                    