# using the `audiences` api

having used python to create queries in the `analytics` application, we turn our attention to the `audiences` app. we would like to be able to use python to automate the creation of a search query in that application as well.

(*my thanks to peter fairfax for suggesting this workshop, and for providing code samples upon which it is built.*) 

## this week's exercise:

using only python code, generate the following three new audiences in the `audiences` application (feel free to change the search criteria to something more relevant to your own interests if you'd like, but make sure you can generate queries of at least the same complexity):

- one audience search for twitter users whose `BIO` contains **any** of these terms: `['dog', 'cat', 'puppy', 'kitten']`
- one audience search for twitter users whose `BIO` contains **all** of these terms: `['dog', 'cat', 'love']`
- one audience search for twitter users whose `BIO` contains **any** of these terms: `['dog', 'cat', 'puppy', 'kitten', 'kitty']` **and** whose `INTERESTS` includes `Animals & Pets` (see image below)

![combined search criteria](query_criteria.png)

### bonus challenge (a bit more work):
generate a query for one audience search for twitter users whose `BIO` **or** `TWEETS` contain **any** of these terms: `['cat','kitten']` **and** whose `INTERESTS` includes `Animals & Pets` (see image below)

![bonus challenge query](bonus_challenge.png)


## overview of today's session
compared to how we access the `analytics` app, for which there exists a python SDK (software development kit, i.e. the module `bwapi`), we will approach the `audiences` app quite differently, because it offers no such SDK. however, both applications are built on an API (application programming interface), which we can use. the difference here is that we will have to write raw `requests` calls to the API directly, as opposed to going through a nice wrapper module.

this will be a bit messier, and the authentication is annoyingly fickle, but for the simple task above we can live with that - until `audiences` gets its own SDK. 

- first, we will take a look at the audiences application for a short orientation.
- second, we will see how to intercept and monitor the communication between the browser and the `audiences` application backend, so we can learn its structure and mimic it. (here, we will need to pay special attention to the authentication).
- third, we will need to write code to construct a request to the backend. here, we benefit from peter's efforts last week.
- fourth, we will test our code to construct a query.

## an overview of `audiences`:

- log into the `audiences` application.
- create a new query.
- specify the new query by adding at least 3 terms to search for in the `bio` field. make sure to choose the `any` operator. 
- hit the `search` button.
- hit the `save` button and give your new search a name.

## intercepting the communication with the backend

- in your browser, open the developer tools (cmd-alt-i in `chrome`).
- choose the `network` tab.
- change your query slightly and hit the `search` button again.
- find the line in the `name` column that begins with `preview?...` and select it.
- in the panel to the right of that line, make sure that the `headers` tab is selected and showing.
- find the `request` headers.
- note that the `method` is `POST`.
- note the `request_url` header.
- note the authorisation key field in the `request header`.
- note the `request_payload` header.

these are the fields that we need to send with the request. look carefully at the structure of the request payload.

In [41]:
# here is what i got when i did it:
request_url = 'https://audiences.brandwatch.com/api/audiences/audiences/preview?start=0&count=5&page=0&sort=influence&direction=desc'

# the query string ('start=0&count=5&page=0&sort=influence&direction=desc')
# specifies how many results to return and in which order
request_payload = \
'{\
  "id":"5c3130dd5c416100010ab2d7",\
  "query":\
  {\
    "operator":"AND",\
    "children":[\
      {\
        "operator":"OR",\
        "children":\
        [\
          {\
            "operator":"OR",\
            "field":"BIO",\
            "value":["cat"]\
          },\
          {\
            "operator":"OR",\
            "field":"BIO",\
            "value":["dog"]\
          },\
          {\
            "operator":"OR",\
            "field":"BIO",\
            "value":["kittten"]\
          },\
          {\
            "operator":"OR",\
            "field":"BIO",\
            "value":["puppy"]\
            },\
            {\
              "operator":"OR",\
              "field":"BIO",\
              "value":["pup"]\
            },\
            {\
              "operator":"OR",\
              "field":"BIO",\
              "value":["doggo"]\
            },\
            {\
              "operator":"OR",\
              "field":"BIO",\
              "value":["kitty"]\
            },\
            {\
              "operator":"OR",\
              "field":"BIO",\
              "value":["feline"]\
            }\
          ]\
        },\
        {\
          "operator":"AND",\
          "children":\
          [\
            {\
              "field":"INTERESTS",\
              "value":["Animals & Pets"],\
              "operator":"OR"\
            }\
          ]\
        }]\
  }\
}'
authorisation_key = 'blablaeyJlbWFpbCI6Im9ob2xtQGJyYW5kd2F0Y2guY29tIiwiZmlyc3ROYW1lIjoiT3NrYXIiLCJsYXN0TmFtZSI6IkhvbG0iLCJjbGllbnRJZCI6MTk5NzM5MjcwNSwiYXBpMkFjY2Vzc1Rva2VuIjoiZXlKaGJHY2lPaUpTVXpJMU5pSXNJblI1Y0NJNklrcFhWQ0lzSW10cFpDSTZJakEyTUdSbFlXSTNZVGRoTlRFME1qazJORGc1TkdGbE5qVXdNVEkxWmpnMFkyVmtOamt5TmpnaWZRLmV5SnliMnhsY3lJNld5SkNWMTlDUVZOSlExOVZVMFZTSWl3aVFsZGZRVVJOU1U1ZlZWTkZVaUpkTENKelkyOXdaU0k2ZTMwc0ltVjRjQ0k2TVRVME5qYzBNekk0Tnl3aVlYVmtJanBiSW1GMVpHbGxibU5sY3k1aWNtRnVaSGRoZEdOb0xtTnZiU0lzSW1Gd2FTNWljbUZ1WkhkaGRHTm9MbU52YlNKZExDSnBjM01pT2lKc2IyZHBiaTVpY21GdVpIZGhkR05vTG1OdmJTSXNJbk4xWWlJNklqSXdOemN6TmpRNU1pSjkuQjNNQ0FYdHhoUHpWc1ltaGt6Z0pua3BNeHFpcDZwM0RHcFJjbTg0ODNyNjdXbl9FZksyYWc3UGFuV2JIWEM5REc1N3M2UktnMGR5QnNYZkwtbWc1Y1pfd0lSYzhfbGk4TFVydEJsSWpfQms0Z3RmVEREWVI1RXZFdEV3czg5VUg1c2pCM3FCWXdaMzRYam5qbGRVb2tLZk9FUzE0Z3VQRmRYQmpZdllTMEZmQmc1cVg2ZzhWcS1BYXU4cEpjN004TDYwMXFfVTdKV0ZZSlhtZFE3Y2MwMjNmOGlJZUR0TDNka0JnbTZqZGVKYVB0a3dJQTZueUxtSml6ZWxxd3NrTE1jaFd6cERUbmh6SWNZN292ZUt0bGpVVVVmam90TjN1OXJRRlNtV0ZiVTF1V3pRaUMxaWpsbE54MGdGWGhTSm9fQzV0ZnVwTjhudmxLN1g1dEZGR0pZYWlSSU43dER4LTc0anhtbkw3MENzZ1ZMdGNEMFBtTWhzWUtLdU45UnBqQ0U4RlpLUUlKN0dSS0VsWk1yd0g5SHE2VWNhLWpTTnF0Y1FiNGs5cHZpa0ZTRFBRaDJOZHA5UzROcktRSnNPd0s4VUFNNzJXelBqeHVmNEJwTGhDeTlGQThQOHQxNS1iVHFUdy1BRW5vOTJTcU9NMzNUZGp6dTZxeGRKa1AwRV9OX3E2aDNIVUV3bzZHNldPd3FPdDZvMHNGWk5sQ0EzY1h5Z1Y1OG9LdGxjYkduMGdFckkwTVNlUHp3aUhjaDkxaVZYaVoxWjFOcXlqOU42VWhqaWx0amMzSmJLbktlWHJPTkRFdk5GVlFPX1ItUjBKMWRqc2dPd0F2SVFGbHJRb3ZmU1pwTWVveHR3Nk44VWJkMDZiSUhjQjFnRllreHNOM2U4aDcxWE4tZkEiLCJyb2xlcyI6W10sInNjb3BlIjp7fSwiZXhwIjoxNTQ2NzQzMjg3LCJhdWQiOiJhdWRpZW5jZXMuYnJhbmR3YXRjaC5jb20iLCJpc3MiOiJsb2dpbi5icmFuZHdhdGNoLmNvbSIsInN1YiI6IjIwNzczNjQ5MiJ9.UQfSiTOmnlV1R3VwP6YC5nGnzP35QKtHE9Xhy4U3sjOxm-PFUk_pDxKOGEEYdRPQ0RvEd38YL29XUWIJ2h1myFH8pZ6NgEAZGsQKKz-mO27zqJGNq9prCSwkxK3OC5SvV6PiFa9T_pI3UK1VKSzN0gnWrzKRbNIeyvwo2gZ97Z1nnqTgCkdbYsJiKgFJeOk5yHUjCiw27kVrjNHsP8CEq90pIXSDF6ZnbDE7qeavvrTssSEQq1kLLFkddXutkowfqhDs5YT4XOoo6I8fYw2PkNmRaaMre6eFJTHV4Y1OyrwQAdk4
zbiLoXqFATx-IwFFZ3S8FZwr6QkL0cJmtkQ257MiFvSj7LwkLpvBQaXqqk5NB-48p5QEhRnxgF78IcTsVAahVeCo5oTQGAHZb74lWukeRtHKvBWgzZgBOochNcoXEl2_RWMftlRwj1ep9ppeApTHIv35zeUCmvliyon-dJK8rZFBy8YV1c47olwEtwSLRG6pv9UsDw8w9TORONR_WbP8vU7aQXZ4c8J7irzlvk3wKKgll2JIbDBpvPJK4oSLK7acoMIA3QTUYXdUZ5Jvt4iazG_8CRWdS7rrpeEiIVsL_3FCWEPYdLheDDYItzz9KlMp-GzrGKWdphLoMux2vUAcDMfmW359ybWd_I0rjArjKU0eJA_Q7MpgIifvAxc'
request_url_save = 'https://audiences.brandwatch.com/api/audiences/audiences?newFormat=true'
user_id = '207736492'
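
before rebuilding the payload in python, it can help to take the captured request apart and look at its pieces. the following is a minimal sketch using only the standard library. the shortened `captured_payload` and the helper name `describe_node` are illustrative assumptions, not part of the `audiences` API.

```python
import json
from urllib.parse import urlsplit, parse_qs

# the preview url captured from the browser (same value as request_url above)
captured_url = ('https://audiences.brandwatch.com/api/audiences/audiences/preview'
                '?start=0&count=5&page=0&sort=influence&direction=desc')

# a shortened version of the captured payload, for illustration only
captured_payload = '''
{"id": "5c3130dd5c416100010ab2d7",
 "query": {"operator": "AND",
           "children": [
             {"operator": "OR",
              "children": [
                {"operator": "OR", "field": "BIO", "value": ["cat"]},
                {"operator": "OR", "field": "BIO", "value": ["dog"]}]},
             {"operator": "AND",
              "children": [
                {"operator": "OR", "field": "INTERESTS",
                 "value": ["Animals & Pets"]}]}]}}
'''


def describe_node(node, depth=0):
    """return indented lines describing a query node and its children."""
    indent = '  ' * depth
    if 'children' in node:
        lines = [f"{indent}{node['operator']} group"]
        for child in node['children']:
            lines.extend(describe_node(child, depth + 1))
        return lines
    # a leaf holds one field and a list of values
    return [f"{indent}{node['field']} contains {node['value']}"]


# split the url into its path and its paging/sorting parameters
url_parts = urlsplit(captured_url)
url_params = {key: values[0] for key, values in parse_qs(url_parts.query).items()}

# parse the payload and describe the nested query
query_tree = json.loads(captured_payload)['query']
tree_lines = describe_node(query_tree)

print(url_parts.path)
print(url_params)
print('\n'.join(tree_lines))
```

the printed tree makes the pattern easier to see: one top-level `AND` group whose children are themselves groups of single-term leaves.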

## using python instead of the browser

using the `requests` [module](http://docs.python-requests.org/en/master/), we can send the requests to the backend from python instead of via the browser. 

the challenge is simply how to generate these requests for a given query of interest.

In [42]:
# we will need the following modules
import json
import requests

In [43]:
# to do that we can use these helper functions written by pf (tweaked by oh):
def create_query_child(search_term_list, search_field, bool_operator='AND'):
    """ 
    given a list of search terms and a search field,
    this function generates a dict containing a
    well-formed child query for the audiences app.

    search_term_list: a list of string values to search for
    search_field: a string, one of 'BIO', 'INTERESTS', 'TWEETS'...
    bool_operator: a string, one of ['AND', 'OR']

    returns: a dict containing a well formed child query.    
    """
    bool_operator = bool_operator.upper()
    if bool_operator not in ['AND', 'OR']:
        # fail loudly instead of returning a list where a dict is expected
        raise ValueError(f'illegal operator: {bool_operator}')
    grandchildren = [] # initialise list for collection of search terms
    for search_term in search_term_list:
        grandchild = {'operator':'OR',     # <--- not used?
                      'field':search_field, 
                      'value':[search_term] 
                     }
        grandchildren.append(grandchild)
    child = {'operator':bool_operator, 'children':grandchildren}
    return child


def add_search_term_to_query_child(query_child, search_term, search_field):
    """
    TO BE COMPLETED (tested):
    given a query child dict (assumed to be well-formed),
    add a search term and search field to it. 
    note that the operator applied to the terms is not changed.
    """
    grandchildren = query_child['children']
    new_grandchild = {'operator':'OR',     # <--- not used?
                      'field':search_field, 
                      'value':[search_term] }
    
    grandchildren.append(new_grandchild)
    query_child['children'] = grandchildren

    return query_child


def create_query_payload_json(list_of_children, query_id = None):
    """
    given a list of query children (in the form of dicts),
    this function packs them into the framework required 
    for the query payload
    """
    query = {
        'id':query_id, 
        'query':{
            'operator':'AND',   # obligatory top level operator
            'children':list_of_children
        }
    }
    query_payload = json.dumps(query) # converts dicts to json string
    return query_payload


def get_query_response(request_url, 
                       authorisation_key, 
                       payload, 
                       user_id=None):
    """
    sends payload (assumed to be a well formed query) to audience's request url
    and returns the response object.
    if you want to control the number of accounts returned, you need to edit 
    the request url.
    """
    char_count = str(len(payload))
    request_headers={
        'Authorization': f'bearer {authorisation_key}', 
        'Content-Type': 'application/json',
        'Content-Length': char_count,
#       'userId': user_id
    }

    response = requests.post(request_url, headers=request_headers, data=payload)
    return response


def save_audience_search(payload, authorisation_key, char_count, user_id):
    """
    sends payload (assumed to be a well formed query) to the save url
    and returns the response object.
    note: the char_count argument is ignored; it is recomputed from the payload.
    """

    char_count = str(len(payload))
    url = 'https://audiences.brandwatch.com/api/audiences/audiences?newFormat=true'
    response = requests.post(
        url, 
        headers={'Authorization': f'bearer {authorisation_key}', 
                 'Content-Type': 'application/json', 
                 'Content-Length': char_count, 
                 'userId':user_id}, 
        data=payload)
    return response

#response_save = save_audience_search(save_payload, authorisation_key, char_count, user_id)
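
the docstring of `get_query_response` notes that the number of accounts returned is controlled by editing the request url. rather than editing the string by hand, the url can be assembled from its parts. this is a rough sketch: `build_preview_url` is a hypothetical helper, and the parameter names are simply the ones seen in the captured url. which other values the backend accepts has not been tested here.

```python
from urllib.parse import urlencode

# base address of the preview endpoint, as captured from the browser
PREVIEW_BASE_URL = 'https://audiences.brandwatch.com/api/audiences/audiences/preview'


def build_preview_url(count=5, start=0, page=0, sort='influence', direction='desc'):
    """hypothetical helper: assemble the preview url from its query parameters."""
    params = {
        'start': start,
        'count': count,       # number of accounts to return
        'page': page,
        'sort': sort,
        'direction': direction,
    }
    return f'{PREVIEW_BASE_URL}?{urlencode(params)}'


url_for_50 = build_preview_url(count=50)
print(url_for_50)
```

the result can be passed as `request_url` to `get_query_response`.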

In [44]:
# example use:
# let us say we want to search for users who have the terms 
# "dog" AND "cat" 
# in their 'BIO' field
example_child = create_query_child(['dog', 'cat'], 'BIO', bool_operator='AND')
example_child

{'operator': 'AND',
 'children': [{'operator': 'OR', 'field': 'BIO', 'value': ['dog']},
  {'operator': 'OR', 'field': 'BIO', 'value': ['cat']}]}

In [45]:
payload = create_query_payload_json([example_child])
payload

'{"id": null, "query": {"operator": "AND", "children": [{"operator": "AND", "children": [{"operator": "OR", "field": "BIO", "value": ["dog"]}, {"operator": "OR", "field": "BIO", "value": ["cat"]}]}]}}'
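
the same pattern extends to a query with two groups, such as the third exercise (any of several `BIO` terms, and an `INTERESTS` value). the sketch below is self-contained, so it uses two condensed stand-ins, `make_group` and `make_payload`, which are illustrative names that mirror `create_query_child` and `create_query_payload_json`. wrapping the interest in its own `AND` group copies what the captured payload does; whether the backend requires that wrapper is an assumption.

```python
import json


def make_group(terms, field, operator):
    """condensed stand-in for create_query_child: one group of single-term leaves."""
    leaves = [{'operator': 'OR', 'field': field, 'value': [term]} for term in terms]
    return {'operator': operator, 'children': leaves}


def make_payload(groups, query_id=None):
    """condensed stand-in for create_query_payload_json."""
    query = {'id': query_id, 'query': {'operator': 'AND', 'children': groups}}
    return json.dumps(query)


# group 1: the bio mentions ANY of the pet terms
bio_group = make_group(['dog', 'cat', 'puppy', 'kitten', 'kitty'], 'BIO', 'OR')

# group 2: interests include 'Animals & Pets'
# (mirrors the AND wrapper seen in the captured payload)
interest_group = make_group(['Animals & Pets'], 'INTERESTS', 'AND')

# the top-level AND requires both groups to match
combined_payload = make_payload([bio_group, interest_group])
print(combined_payload)
```

with the real helpers, the equivalent calls would be `create_query_child(terms, 'BIO', bool_operator='OR')` and `create_query_payload_json([bio_group, interest_group])`.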

In [46]:
request_url

'https://audiences.brandwatch.com/api/audiences/audiences/preview?start=0&count=5&page=0&sort=influence&direction=desc'

In [47]:
#https://audiences.brandwatch.com/api/audiences/audiences/preview?start=0&count=50&page=0&sort=influence&direction=desc

In [48]:
authorisation_key = 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6IjA2MGRlYWI3YTdhNTE0Mjk2NDg5NGFlNjUwMTI1Zjg0Y2VkNjkyNjgifQ.eyJlbWFpbCI6Im9ob2xtQGJyYW5kd2F0Y2guY29tIiwiZmlyc3ROYW1lIjoiT3NrYXIiLCJsYXN0TmFtZSI6IkhvbG0iLCJjbGllbnRJZCI6MTk5NzM5MjcwNSwiYXBpMkFjY2Vzc1Rva2VuIjoiZXlKaGJHY2lPaUpTVXpJMU5pSXNJblI1Y0NJNklrcFhWQ0lzSW10cFpDSTZJakEyTUdSbFlXSTNZVGRoTlRFME1qazJORGc1TkdGbE5qVXdNVEkxWmpnMFkyVmtOamt5TmpnaWZRLmV5SnliMnhsY3lJNld5SkNWMTlDUVZOSlExOVZVMFZTSWl3aVFsZGZRVVJOU1U1ZlZWTkZVaUpkTENKelkyOXdaU0k2ZTMwc0ltVjRjQ0k2TVRVME5qZzFPRE0xT1N3aVlYVmtJanBiSW1GMVpHbGxibU5sY3k1aWNtRnVaSGRoZEdOb0xtTnZiU0lzSW1Gd2FTNWljbUZ1WkhkaGRHTm9MbU52YlNKZExDSnBjM01pT2lKc2IyZHBiaTVpY21GdVpIZGhkR05vTG1OdmJTSXNJbk4xWWlJNklqSXdOemN6TmpRNU1pSjkuS202bWxMQUVxSVl2ejdaQUp3WEI0YzhnYkVKVmZfdXRxbmJzN2xmUDFFaU0yVWthTjRESHpYU3NxOWI1RjJBaVhscEthQUhRaGdpd0UwX0RVZDZUVzNUSnRiOGxmTVpEMlRvelhMd005MFZZRXdfb2tYSThta3dPc0EyX1dMU0stVDBZS09jSnRnZGVuXzdoVXJUQTdheEdCVTZMNFg3ZG1ValdObFJ2c3VsR3ZUeDVwanlNakJFdWJHLTlpSkRfR2kzRDJ0SU1GckRLb0NDQmNiS3dpLUNRWkhQenRibXUtaEpoYXNLampYQjJ4NklmTGRETzBwMGs5UFo2aEZOWHJTLVJtOXFRNTBZMC1ITUdwd0wteUtXYVQ2S090cFVQLXE4emlyZ0dpYUE3WWVzWVZ3UnRNLXp5ZDBQbVBsQW9jRnlnd2VJZHdlWGp6WGxabzhRRWJCeHlSMjVLMjd2MDhISXlwYmJlVTZGak5iTXY3c2NjQ1JvbzVPbHFHTGNfNFY2RlBHT2kyOHFfc3dxWlQ4TVd1T25RN3QwbWk1b3A4RVpvc0RVMU16MTFST053bm9TN1hUdmtyWTh2dHFTald2bmE3ZkRIdzJKYnVRU0ZwMTR6Z1NSZUJObUZmU2pRanBURlV3MGxnSXhKSkpyY1A0Y2F6V2JaQmpvTHJtRHplWmZZeElfZ1FkMThrMDZRUlVwNW9ub0tiQjAwbVNBdUxaeHpKa1hTVW80RG5QYzlzR0ZWdEpxLTlNckplSTI3Q0h6NnVHNHItNkZmb2RiQ095Uk9GVnpNQlVlMFZCTC1CQ0gydnJ3Qkc5VmZYVjFDUjAyUWZHamxRYjBlSURtS0lITzFGZzJBRHZWaGliek5WNG1lYTBIVjVYcWZwc2JfTFpJMDZuMTRjTGciLCJyb2xlcyI6W10sInNjb3BlIjp7fSwiZXhwIjoxNTQ2ODU4MzU5LCJhdWQiOiJhdWRpZW5jZXMuYnJhbmR3YXRjaC5jb20iLCJpc3MiOiJsb2dpbi5icmFuZHdhdGNoLmNvbSIsInN1YiI6IjIwNzczNjQ5MiJ9.HcL3DVv-CUv2PE8i0Wav4KJbbtrJ0bzOp7OdUA_UEB7FgvpXCojbTYbzk3HTWSyavu0iviSt6DLUk9p7lPUsw9PXwjSigtKKklHfwNQNP7VQqZHGVPmwPyU-OgYItT77y8QEw10o9bKWDWtjhcA8AJ86kPkZwQkXItj2CvWTIh5UVm0v1hEN6urrjMxlyLvSIU0cEV2EzUj76cm
_SlLBYDusb08Yu37Bf7AyZg_41kal8PWtu3mDUvWHVGgSnyb8N8EPQgwW_5dqoKkIZy4i0i9naxlch0duACESrrM-d3PwcRvpIjE2386G10Es-RAmVI50Cqku9J30QWpKVSH350ZzVCtrJreYKOTKJkW1T5umxusmtPTpPp0RN0Rjfy6UQyEws5Kb6gVVK_2i094669z3iwoytsanCQ7IMb3ocHk1x9riU30gYZR5CSNYKhfq96lNAlzxWgkMVfNMXlrkSGCe-N9wML-WBO8Hc3I80-JjvY1OYNQC5TC8pnZkOqgbbBBjD4YoowlnzA8SEECY7WjWH9798sENR83cZBXgsnByPGxxzTwpz5jqf5RtETyqPSmxvYfGXTn2gsJqMwwuJKXmiSXZ5zFc-W395-J3gUxUpAxpoKJd_6yWbdYeGwuaprDqVbj4y0yjqNPuAgqzQypry50g4fzAtxW0dIT_ZpM'
#response = get_query_response(request_url, authorisation_key, payload, user_id ='207736492')
#  207736492
response = get_query_response(request_url, authorisation_key, payload)
# "5c3130dd5c416100010ab2d7"
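
one possible reason a copied authorisation key stops working is that it has expired. the key above starts with `eyJ`, which is how base64-encoded JSON begins, so it appears to be a JSON Web Token (JWT). assuming that is the case, its expiry time can be read without contacting the server. the sketch below uses a made-up demo token, and the helper names `jwt_claims` and `seconds_until_expiry` are illustrative. it only inspects the token; it does not verify the signature.

```python
import base64
import json
import time


def jwt_claims(token):
    """decode the (unverified) claims segment of a JWT-style token."""
    claims_segment = token.split('.')[1]
    # base64url segments drop their '=' padding; restore it before decoding
    padded = claims_segment + '=' * (-len(claims_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(token, now=None):
    """seconds left before the 'exp' claim; negative once the token has expired."""
    now = time.time() if now is None else now
    return jwt_claims(token)['exp'] - now


def _segment(data):
    """encode a dict as an unpadded base64url segment (for the demo token only)."""
    raw = json.dumps(data).encode('utf-8')
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')


# a made-up, unsigned token for demonstration; a real key comes from the browser
demo_token = '.'.join([
    _segment({'alg': 'none'}),
    _segment({'sub': 'demo-user', 'exp': 1600000000}),
    'signature-placeholder',
])

print(jwt_claims(demo_token))
# ten minutes before the expiry time, 600 seconds remain
print(seconds_until_expiry(demo_token, now=1600000000 - 600))
```

if the remaining time is negative, a fresh key has to be copied from the browser before sending further requests.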

In [49]:
response.text

'{"results":[{"piScore":84,"name":"Sarah Millican","screenName":"SarahMillican75","profileLocation":"hopefully sitting down","demographics":{"location":{"name":"County Down","population":516000,"geonameId":2651037,"featureCode":"ADMD","codes":{"PCL":"GB","ADM1":"NIR","SUBADM":"2651037"},"coords":{"lat":54.33333,"lon":-5.75},"source":"PROFILE_LOCATION"},"professions":[{"name":"Executive","jobs":["Founder"]},{"name":"Artist","jobs":["Comedian"]}],"interests":["Animals & Pets","Books","Politics"],"accountType":"INDIVIDUAL","accountTypeClassificationProbability":1.0,"gender":"FEMALE","source":"MODEL"},"image":"https://pbs.twimg.com/profile_images/849266627334074369/gTqe5ILO_normal.jpg","id":"20244875","summary":"Comedian, writer, founder of Standard Issue podcast. Feminist, eater, dog & cat mam. Tweets ending with TSM are by my team (Team Sarah Millican).","metrics":{"followers":2065639,"following":4132,"statuses":95715,"favourites":5860,"listed":5166},"verified":true,"lastUpdated":1546814
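
the raw response text is hard to read, but it is JSON and can be parsed into something more manageable. the sketch below works on a trimmed sample shaped like the output above (most fields omitted), so it runs without a live request. `summarise_accounts` is an illustrative helper name, and since the output above is cut off, the assumption that everything of interest sits under `results` is based only on the visible part.

```python
import json

# a trimmed sample shaped like the response text above (most fields omitted)
sample_response_text = '''
{"results": [
  {"name": "Sarah Millican",
   "screenName": "SarahMillican75",
   "demographics": {"interests": ["Animals & Pets", "Books", "Politics"]},
   "metrics": {"followers": 2065639, "following": 4132},
   "verified": true}
]}
'''


def summarise_accounts(response_text):
    """pull a few fields out of each account in a preview response."""
    rows = []
    for account in json.loads(response_text)['results']:
        rows.append({
            'screen_name': account['screenName'],
            'followers': account['metrics']['followers'],
            # demographics may be missing for some accounts, so fall back to []
            'interests': account.get('demographics', {}).get('interests', []),
        })
    return rows


account_rows = summarise_accounts(sample_response_text)
for row in account_rows:
    print(row)
```

with a live response object, `summarise_accounts(response.text)` would be the equivalent call.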

In [50]:
9999999999999999.0 - \
9999999999999998.0

2.0