# Pulling data from the programme ratings endpoint using pybarb

In this demo we will show you how to pull data from the programme ratings endpoint and then manipulate it using the pybarb library. 

We illustrate this using the following use case: 
The BBC would like to see how its regular daily news slots have performed over the last couple of years. In particular, they would like to pick out any important trends and events in a time series of audience figures. 

Note the full API documentation can be found [here](https://barb-api.co.uk/api-docs). 

It might also be useful to consult the [Getting Started](https://barb-api.co.uk/api-docs#section/Getting-started) section for information about authentication and basic API usage.


## Querying the API with pybarb

First we connect to the API using the `pybarb` package as described in "Connecting to the Barb API using Python". 

In [1]:
import json
import pybarb as pb

# Set the working directory
working_directory = '/path/to/your/dir/'

# Get the access token
with open(working_directory + "creds.json") as file:
    creds = json.load(file)

# Create a BarbAPI object and connect
barb_api = pb.BarbAPI(creds)
barb_api.connect()


## Get data from the API

To pull the right data from the API we need to know the correct station name and panel name. 

### Getting the station name

The `list_stations` method can be used to search all valid station names. Here we look for those that contain 'BBC'.

In [2]:
barb_api.list_stations("bbc")

['BBC1',
 'BBC1 Network',
 'BBC2',
 'BBC2 Network',
 'BBC Scotland',
 'BBC HD',
 'BBC Three',
 'CBBC',
 'BBC4',
 'BBC Parliament',
 'BBC Knowledge',
 'BBC Choice England',
 'BBC News',
 'BBC RB HD',
 'BBC RB 2',
 'BBC RB 3',
 'BBC Winter Olympics Red Button',
 'BBC RB 0',
 'BBC RB 4',
 'BBC RB 5',
 'BBC RB 603',
 'BBC RB 7',
 'BBC RB 8',
 'BBC RB 602',
 'BBC FREEVIEW 301 HD',
 'BBC Olympics 1',
 'BBC Olympics 2',
 'BBC Olympics 3',
 'BBC Olympics 4',
 'BBC Olympics 5',
 'BBC Olympics 6',
 'BBC Olympics 7',
 'BBC Olympics 8',
 'BBC Olympics 9',
 'BBC Olympics 10',
 'BBC Olympics 11',
 'BBC Olympics 12',
 'BBC Olympics 13',
 'BBC Olympics 14',
 'BBC Olympics 15',
 'BBC Olympics 16',
 'BBC Olympics 17',
 'BBC Olympics 18',
 'BBC Olympics 19',
 'BBC Olympics 20',
 'BBC Olympics 21',
 'BBC Olympics 22',
 'BBC Olympics 23',
 'BBC Olympics 24',
 'BBC RB 6',
 'BBC RB 6781',
 'BBC RB 6785',
 'BBC RB 6786',
 'BBC RB 6787',
 'BBC RB 6788',
 'BBC RB 6789',
 'BBC RB 6790',
 'BBC RB 601',
 'BBC RB 1
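
The query above was lowercase ("bbc") yet returned names such as 'BBC1' and 'CBBC', which suggests a case-insensitive substring match. The sketch below shows that behaviour in plain Python. It is not pybarb's implementation: `search_names` is a hypothetical helper, and the last entry in the list is made up for illustration.

```python
def search_names(names, pattern):
    """Return the names that contain `pattern`, ignoring case."""
    needle = pattern.lower()
    return [name for name in names if needle in name.lower()]


# A few station names copied from the output above, plus one invented
# non-BBC name so that the filter has something to exclude.
stations = ["BBC1", "BBC2", "CBBC", "BBC News", "Example Channel"]

matches = search_names(stations, "bbc")
print(matches)
```

Because the match is on substrings, a short search term can return many names, so it is worth copying the exact station name from the list before querying.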

### Getting the panel name

Similarly, the `list_panels` method can be used to search all valid panel names that contain 'BBC'.

In [3]:
barb_api.list_panels("bbc")

['BBC Network',
 'BBC East Region',
 'BBC West Region',
 'BBC South West Region',
 'BBC South Region',
 'BBC Yorkshire & Lincolnshire',
 'BBC North East & Cumbria',
 'BBC North West Region',
 'BBC Scotland Region',
 'BBC Ulster Region',
 'BBC Wales Region',
 'BBC Midlands West',
 'BBC Midlands East',
 'BBC London',
 'BBC South East']

### Querying the programme ratings endpoint

Now that we know the station name and panel name, we can query the programme ratings endpoint. This can be done very simply using pybarb's `programme_ratings` method. 

In [4]:
programme_data = barb_api.programme_ratings(
    min_transmission_date="2021-01-01",
    max_transmission_date="2022-01-01",
    station="BBC1",
    panel="BBC Network",
)

## Accessing the data

The raw data is stored in the `api_response_data` attribute of the resulting object (in this case, the object named `programme_data`).

In [5]:
programme_data.api_response_data

{'endpoint': 'programme_ratings',
 'events': [{'panel': {'panel_code': 50,
    'panel_region': 'BBC Network',
    'is_macro_region': False},
   'station': {'station_code': 10, 'station_name': 'BBC1'},
   'transmission_log_programme_name': 'LOOK EAST',
   'programme_type': 'programme',
   'programme_start_datetime': {'barb_reporting_datetime': '2021-06-07 13:33:39',
    'barb_polling_datetime': '2021-06-07 13:33:39',
    'standard_datetime': '2021-06-07 13:33:39'},
   'programme_duration': 9,
   'spans_normal_day': False,
   'sponsor': {'sponsor_code': None, 'bumpers': 'not_sponsored'},
   'broadcaster_transmission_code': '16393663275812',
   'live_status': 'unknown',
   'uk_premier': False,
   'broadcaster_premier': False,
   'repeat': True,
   'programme_content': {'content_name': 'Look East: Series 2021',
    'barb_content_id': 5325087,
    'broadcaster_content_id': 'm000wx4d',
    'metabroadcast_information': {'metabroadcast_content_id': 'nt2952'},
    'episode': {'episode_number': 

However, it is easier to work with the data as a dataframe. To do this, we can use the `to_dataframe()` method, which flattens the nested JSON structure.

In [6]:
programme_df = programme_data.to_dataframe()
programme_df


Unnamed: 0,panel_region,station_name,programme_name,programme_type,programme_start_datetime,programme_duration_minutes,spans_normal_day,uk_premiere,broadcaster_premiere,programme_repeat,episode_number,episode_name,genre,audience_size_hundreds,date_of_transmission,audience_name,audience_target_size_hundreds
0,BBC Network,BBC1,Look East: Series 2021,programme,2021-06-07 13:33:39,9,False,False,False,True,0.0,07/06/2021,News,981,2021-06-07,All Homes,269930
1,BBC Network,BBC1,Look East: Series 2021,programme,2021-06-07 13:33:39,9,False,False,False,True,0.0,07/06/2021,News,1504,2021-06-07,All Adults,510880
2,BBC Network,BBC1,Look East: Series 2021,programme,2021-06-07 13:33:39,9,False,False,False,True,0.0,07/06/2021,News,804,2021-06-07,All Men,248860
3,BBC Network,BBC1,Look East: Series 2021,programme,2021-06-07 13:33:39,9,False,False,False,True,0.0,07/06/2021,News,903,2021-06-07,All Houseperson,269930
4,BBC Network,BBC1,Look East: Series 2021,programme,2021-06-07 13:33:39,9,False,False,False,True,0.0,07/06/2021,News,0,2021-06-07,All Children aged 4-15,95260
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1521175,BBC Network,BBC1,"Spotlight: Series 2021, Episode 218",programme,2021-01-29 22:33:41,11,False,True,True,False,218.0,29/01/2021,News,0,2021-01-29,Boys 10-12,12320
1521176,BBC Network,BBC1,"Spotlight: Series 2021, Episode 218",programme,2021-01-29 22:33:41,11,False,True,True,False,218.0,29/01/2021,News,425,2021-01-29,"Adults, Lightest Third",170080
1521177,BBC Network,BBC1,"Spotlight: Series 2021, Episode 218",programme,2021-01-29 22:33:41,11,False,True,True,False,218.0,29/01/2021,News,176,2021-01-29,"Adults, Lightest Sixth",85040
1521178,BBC Network,BBC1,"Spotlight: Series 2021, Episode 218",programme,2021-01-29 22:33:41,11,False,True,True,False,218.0,29/01/2021,News,237,2021-01-29,"ABC1 Adults, Lightest Third",96620
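
To make "flattening" concrete, here is a minimal sketch of how one nested event could be turned into a single-level record. It uses only the standard library and a cut-down event built from fields in the raw response shown earlier. The `flatten` helper and its key-naming scheme are assumptions for illustration; pybarb's `to_dataframe()` chooses its own column names (for example `panel_region` rather than `panel_panel_region`).

```python
def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts into a single-level dict."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, prefixing keys with the parent name.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# A cut-down event using fields from the raw response shown earlier.
event = {
    "panel": {"panel_code": 50, "panel_region": "BBC Network"},
    "station": {"station_code": 10, "station_name": "BBC1"},
    "programme_duration": 9,
}

flat_event = flatten(event)
print(flat_event)
```

Each flattened event corresponds to one row, which is why the dataframe can be built directly from the list of events.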


## Manipulating the data

We can also pivot the data so that each audience becomes a column.

In [7]:
programme_data.audience_pivot()

panel_region,station_name,date_of_transmission,programme_name,"ABC1 Adults, Lightest Third",Adults 16-24,Adults 16-34,"Adults 16-34, Lightest Third",Adults 18-20,Adults 21-24,Adults 35-44,Adults 45-49,Adults 45-54,Adults 55-64,...,Men AB,Men AB working full-time,Men ABC1,Men ABC1 16-24,Men ABC1 16-34,Men ABC1 16-44,Men ABC1 35-54,Men ABC1 working full-time,Men C2,Men working full-time
BBC Network,BBC1,2021-01-01,Archbishop of Canterbury's New Year Message: Series 2021,347,59,620,124,29,30,1438,601,1468,1189,...,988,458,1938,24,148,390,570,936,879,1651
BBC Network,BBC1,2021-01-01,BBC London,176,21,271,23,3,17,813,1039,1524,1210,...,836,573,1788,0,0,283,843,1066,257,1416
BBC Network,BBC1,2021-01-01,BBC News at One: Series 2021,401,121,594,87,10,60,1730,1131,2783,2630,...,1985,485,4167,0,61,333,796,1226,1931,2447
BBC Network,BBC1,2021-01-01,BBC News at Six: Series 2021,990,420,1178,289,77,264,1805,1768,4174,5194,...,2567,1047,6021,128,297,778,1432,2944,2005,4819
BBC Network,BBC1,2021-01-01,BBC News at Ten: Series 2021,2164,428,1550,94,174,206,2231,2194,5680,6868,...,4960,1832,9219,95,393,864,1757,3777,4294,7059
BBC Network,BBC1,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
BBC Network,BBC1,2022-01-01,"The Weakest Link: Series 1, Episode 7",1690,1197,2790,400,271,665,2328,2134,5618,7559,...,3810,1652,8591,301,810,1188,1850,3861,2630,6141
BBC Network,BBC1,2022-01-01,Weather for the Week Ahead: Series 2021,247,351,879,148,141,48,753,441,1082,565,...,333,333,628,42,42,82,418,492,197,1102
BBC Network,BBC1,2022-01-02,BBC News,32,88,175,12,52,35,138,101,236,396,...,93,50,228,6,14,96,107,153,158,251
BBC Network,BBC1,2022-01-02,FILM: Man Up (2015),147,63,302,34,16,39,403,458,912,607,...,444,374,985,0,35,368,634,836,290,953
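
For readers who want to see the reshaping step on its own, the sketch below performs a comparable long-to-wide pivot with pandas' `pivot_table`. It is not the code behind `audience_pivot()`: the column names follow the dataframe shown earlier, but the programme names, the numbers, and the choice of `sum` as the aggregation are assumptions for illustration.

```python
import pandas as pd

# Synthetic long-format rows: one row per programme and audience.
long_df = pd.DataFrame(
    {
        "date_of_transmission": ["2021-01-01"] * 4,
        "programme_name": ["Programme A", "Programme A", "Programme B", "Programme B"],
        "audience_name": ["All Adults", "All Men", "All Adults", "All Men"],
        "audience_size_hundreds": [100, 40, 250, 120],
    }
)

# Turn each audience into its own column, keeping one row per programme per day.
wide_df = long_df.pivot_table(
    index=["date_of_transmission", "programme_name"],
    columns="audience_name",
    values="audience_size_hundreds",
    aggfunc="sum",
)
print(wide_df)
```

The wide layout makes it easy to compare audiences side by side for a given programme, while the long layout is usually more convenient for filtering and plotting.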


## Filtering for the news programmes

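We can search the `programme_name` column to find the programmes we are looking for. First we strip everything after the first colon (for example ': Series 2021') so that each programme has a single, consistent name.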

In [8]:
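# Keep only the text before the first colon, e.g. 'BBC News at One: Series 2021' -> 'BBC News at One'
programme_df['programme_name'] = programme_df['programme_name'].str.split(':', expand=True)[0]
# List the distinct programme names that contain 'News'
programme_df['programme_name'][programme_df['programme_name'].str.contains('News')].unique()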

array(['BBC News at One', 'BBC Weekend News', 'BBC Newsline', 'BBC News',
       'BBC News at Ten', 'BBC News at Six', 'Joins BBC News',
       'Have I Got a Bit More News for You', 'BBC News Special',
       'Newscast', 'Have I Got News for You', 'BBC Scotland News Special',
       'News 24', "Ten O'Clock News", 'Prince William does Newscast'],
      dtype=object)
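
The same two steps can be tried on a handful of names without touching the full dataframe. The sketch below uses raw programme names taken from the outputs earlier in this demo; the variable names are illustrative.

```python
import pandas as pd

# Raw programme names as they appear in the API output.
names = pd.Series(
    [
        "BBC News at One: Series 2021",
        "Look East: Series 2021",
        "Spotlight: Series 2021, Episode 218",
        "BBC News",
    ]
)

# Keep only the text before the first colon.
titles = names.str.split(":", n=1).str[0]

# Case-sensitive substring match on 'News', then de-duplicate.
news_titles = titles[titles.str.contains("News")].unique().tolist()
print(news_titles)
```

Note that splitting on the first colon also shortens names where the colon is part of the title: 'FILM: Man Up (2015)' from the pivot output would become just 'FILM'. That does not affect the news programmes used here, but it matters if the cleaned names are reused for other genres.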

Now we filter for just the regular news programmes.

In [9]:
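# Keep only the four regular news programmes
regular_news = ['BBC News at One', 'BBC News at Six', 'BBC News at Ten', 'BBC Weekend News']
bbc_news = programme_df[programme_df['programme_name'].isin(regular_news)]
# Use the 'All Homes' audience, sorted chronologically within each programme
bbc_news_all_homes = (bbc_news[bbc_news['audience_name'] == "All Homes"]
                      .sort_values(["programme_name", "programme_start_datetime"]))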


In [10]:
bbc_news_all_homes

Unnamed: 0,panel_region,station_name,programme_name,programme_type,programme_start_datetime,programme_duration_minutes,spans_normal_day,uk_premiere,broadcaster_premiere,programme_repeat,episode_number,episode_name,genre,audience_size_hundreds,date_of_transmission,audience_name,audience_target_size_hundreds
690840,BBC Network,BBC1,BBC News at One,programme,2021-01-01 12:40:30,14,False,True,True,False,0.0,01/01/2021,News,14811,2021-01-01,All Homes,269680
1254600,BBC Network,BBC1,BBC News at One,programme,2021-01-04 13:00:00,33,False,True,True,False,0.0,04/01/2021,News,31899,2021-01-04,All Homes,269680
1272660,BBC Network,BBC1,BBC News at One,programme,2021-01-05 13:00:00,34,False,True,True,False,0.0,05/01/2021,News,29484,2021-01-05,All Homes,269680
1276080,BBC Network,BBC1,BBC News at One,programme,2021-01-06 13:00:00,34,False,True,True,False,0.0,06/01/2021,News,28517,2021-01-06,All Homes,269680
1292580,BBC Network,BBC1,BBC News at One,programme,2021-01-07 13:00:00,34,False,True,True,False,0.0,07/01/2021,News,28444,2021-01-07,All Homes,269680
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39900,BBC Network,BBC1,BBC Weekend News,programme,2021-12-26 12:55:00,12,False,True,True,False,0.0,26/12/2021,News,12733,2021-12-26,All Homes,270410
1980,BBC Network,BBC1,BBC Weekend News,programme,2021-12-26 22:26:10,13,False,True,True,False,0.0,26/12/2021,News,17023,2021-12-26,All Homes,270410
977940,BBC Network,BBC1,BBC Weekend News,programme,2022-01-01 13:00:10,12,False,True,True,False,0.0,01/01/2022,News,9950,2022-01-01,All Homes,270570
976320,BBC Network,BBC1,BBC Weekend News,programme,2022-01-01 17:10:00,11,False,True,True,False,0.0,01/01/2022,News,21965,2022-01-01,All Homes,270570
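
Both `audience_size_hundreds` and `audience_target_size_hundreds` are expressed in hundreds, so their ratio gives the audience as a share of the target population. The sketch below works this through for the first row of the table above. It assumes that `audience_target_size_hundreds` is the total size of the audience category (here, All Homes); `audience_share_percent` is a hypothetical helper, not part of pybarb.

```python
def audience_share_percent(audience_size_hundreds, audience_target_size_hundreds):
    """Audience as a percentage of the target population.

    Both inputs are in hundreds, so the units cancel in the ratio.
    """
    return 100 * audience_size_hundreds / audience_target_size_hundreds


# First row of the table above: BBC News at One on 2021-01-01, All Homes.
share = audience_share_percent(14811, 269680)
print(f"{share:.2f}%")
```

Under that assumption, the bulletin reached roughly 5.5% of homes. Expressing figures as a share can make programmes easier to compare across dates, because the target size changes slightly over the year (269680 in January 2021 versus 270570 in January 2022).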


## Plotting the data

Finally, we plot the All Homes audience for each of the regular news programmes as a time series.

In [11]:
import plotly.express as px
# One line per programme: audience size (in hundreds) over time
px.line(bbc_news_all_homes, x="programme_start_datetime", y="audience_size_hundreds",
        color="programme_name", width=1300, height=500)
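
Daily audience figures tend to be noisy, which can make longer-term trends hard to see in the raw plot. One common option, not part of the original notebook, is to smooth each programme's series with a rolling mean before plotting. The sketch below uses synthetic figures and a window of 3 broadcasts purely for illustration; the column names match the dataframe above, and the data must already be sorted by time within each programme.

```python
import pandas as pd

# Synthetic daily figures for one programme, made up for illustration.
daily = pd.DataFrame(
    {
        "programme_start_datetime": pd.date_range("2021-01-04", periods=6, freq="D"),
        "programme_name": ["BBC News at One"] * 6,
        "audience_size_hundreds": [30000, 29000, 28000, 31000, 27000, 26000],
    }
)

# Rolling mean over the last 3 broadcasts, computed separately per programme.
daily["rolling_mean"] = daily.groupby("programme_name")[
    "audience_size_hundreds"
].transform(lambda s: s.rolling(window=3, min_periods=1).mean())

print(daily[["programme_start_datetime", "audience_size_hundreds", "rolling_mean"]])
```

The smoothed column could then be passed as `y` to `px.line` in place of `audience_size_hundreds`. A wider window shows the trend more clearly but flattens the short-lived spikes that mark individual events.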