In [1]:
from IPython.display import Image, display

# Language processing in the cloud

In [2]:
Image(url='https://i.pinimg.com/564x/7f/7e/1d/7f7e1dd5970675a0de86f147f234578d.jpg')

# Language processing in the cloud

- Software as a Service (SaaS) is a very popular way of giving access to software
- The software runs in the cloud and users pay some kind of subscription to access it
- Great way to develop (commercial) NLP applications that mashup information from several services
- Can lead to scalable applications
- There are already several established providers of APIs for language processing (usually branded as text analytics)
- Difficult to assess how accurate these tools are

Read more in Dale, R. (2015). NLP meets the cloud. *Natural Language Engineering*, 21(04), 653–659. <a href="http://doi.org/10.1017/S1351324915000200" target="_blank">http://doi.org/10.1017/S1351324915000200</a>

In [3]:
Image(url="https://i.pinimg.com/564x/a4/c9/d4/a4c9d4682c4d7a1d729c7060dbca217d.jpg")

# Language processing in the cloud

- assumes that the processing you need is available through an API accessible over HTTP(S)
- the text to be processed is submitted over HTTP (be careful about privacy issues)
- the result is returned over HTTP, usually formatted as JSON
- to access an API you usually need a key, and access may not be free


# Typical steps for accessing a SaaS

1. Import a relevant module (e.g. requests)
2. Prepare the request by specifying the key and the relevant parameters
3. Make the HTTP request and save the response
4. Process the response and parse the JSON it contains to extract the actual information
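
The four steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not an official client: `call_service`, `FakeResponse`, `fake_post` and the example URL are hypothetical names, and the stub stands in for `requests.post` so that the sketch runs without a network connection or a real key.

```python
import json  # step 1: in real use you would also `import requests`


def call_service(url, key, text, post):
    """Steps 2-4; `post` is any callable shaped like requests.post(url, data=...)."""
    payload = {"key": key, "txt": text}   # step 2: key and parameters
    response = post(url, data=payload)    # step 3: make the request, keep the response
    return json.loads(response.text)      # step 4: parse the JSON in the response


class FakeResponse:
    """Hypothetical stand-in for a response object with a .text attribute."""
    text = '{"status": {"code": "0", "msg": "OK"}}'


def fake_post(url, data):
    """Hypothetical stub: ignores its arguments and returns a canned answer."""
    return FakeResponse()


result = call_service("https://example.invalid/api", "MY_KEY", "Hello", fake_post)
print(result["status"]["msg"])
```

With a real key, passing `requests.post` as the last argument instead of `fake_post` should send the same payload to the service.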

# Using MeaningCloud

- One of the big players in the field of SaaS for NLP
- Offers a variety of services
- Chosen for practical reasons: easy to set up an account; 40,000 free calls per month; easy to access (no installation necessary, no credit card required)

<a href="https://www.meaningcloud.com">https://www.meaningcloud.com</a>

# 1. Import the relevant module

- In Python, the requests module is widely used
- It is referred to as "HTTP for Humans" because it makes the process easy
- To use it, all that is necessary is to import it:

```python
import requests
```

- If importing it raises an error, you may need to install it on your computer. Installation instructions are available at <a href="http://docs.python-requests.org/en/master/user/install/#install" target="_blank">http://docs.python-requests.org/en/master/user/install/#install</a>

- Documentation is available at <a href="http://docs.python-requests.org/en/master/" target="_blank">http://docs.python-requests.org/en/master/</a>

In [4]:
import requests

# 2. Initialise the necessary variables

- Initialise the url variable with the relevant URL
- The URLs for different services can be found by clicking on the relevant service from <a href="https://www.meaningcloud.com/developer/apis" target="_blank">https://www.meaningcloud.com/developer/apis</a> and selecting 

```python
url = "http://api.meaningcloud.com/lang-2.0"
```
- specify your key. It can be copied from <a href="https://www.meaningcloud.com/developer/account/subscription" target="_blank">https://www.meaningcloud.com/developer/account/subscription</a>

```python
key="...."
```

- specify the text you want to process

```python
text="..."
```
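
One way to avoid typing or hard-coding the key in a notebook is to read it from an environment variable. The sketch below is only a suggestion: the function `load_key` and the variable name `MEANINGCLOUD_KEY` are hypothetical, not something the service requires.

```python
import os


def load_key(env_var="MEANINGCLOUD_KEY"):
    """Return the API key stored in the environment variable `env_var`."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Demo only: normally the variable is set in the shell, not in the notebook.
os.environ["MEANINGCLOUD_KEY"] = "demo-key"
key = load_key()
```

Keeping the key out of the notebook also keeps it out of any saved cell output.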

In [5]:
url = "http://api.meaningcloud.com/lang-2.0"
key=input("Please enter the key:")
text = "Sąd składał się początkowo z przewodniczącego i siedmiu sędziów. Liczba ta została następnie poszerzona i obecnie na Sąd Najwyższy składa się z przewodniczącego i dwudziestu pięciu sędziów. Aby zostać sędzią Sądu Najwyższego należy posiadać indyjskie obywatelstwo oraz być sędzią sądu stanowego przez co najmniej 5 lat, adwokatem sądu stanowego przez co najmniej 10 lat, lub być wybitnym prawnikiem (decyduje opinia prezydenta)."

Please enter the key:87be92031adf44e7b473e30d9974b603


# 3. Prepare and make the request

- prepare the headers for the request (the exact format is specified in the documentation)

```python
headers = {'content-type': 'application/x-www-form-urlencoded'}
```
- specify the parameters for the request as a dictionary. The number of parameters and their meaning differ from service to service and are specified in the documentation

```python
payload = {"key": key, "txt": text}
```

- requests form-encodes the dictionary and encodes the text as UTF-8, so there is no need to call `text.encode("utf-8")` yourself (in Python 3, formatting a `bytes` object into a string would insert its `b'...'` representation instead of the text)

- make the request

```python
response = requests.request("POST", url, data=payload, headers=headers)
```
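
To see what a form-encoded request body looks like, the standard library can build one without sending anything. This is a sketch for illustration: the key is a placeholder, the sample text is a short Polish phrase, and the body that requests actually sends should have the same shape. The last two lines show why formatting an encoded `bytes` object into a string with `%s` is a problem in Python 3.

```python
from urllib.parse import urlencode

# Placeholder key and a short Polish sample; both are illustrative only.
params = {"key": "MY_KEY", "txt": "Sąd Najwyższy"}

# Form encoding: non-ASCII characters become percent-encoded UTF-8 bytes
# and spaces become '+'.
body = urlencode(params)
print(body)    # key=MY_KEY&txt=S%C4%85d+Najwy%C5%BCszy

# Formatting a bytes object with %s inserts its repr, not the text.
broken = "txt=%s" % "Sąd".encode("utf-8")
print(broken)  # txt=b'S\xc4\x85d'
```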

In [6]:
# Pass the parameters as a dict: requests form-encodes them (text as UTF-8)
payload = {"key": key, "txt": text}
headers = {'content-type': 'application/x-www-form-urlencoded'}

response = requests.request("POST", url, data=payload, headers=headers)

# 4. Process the returned JSON

- usually the answer is returned as a JSON (JavaScript Object Notation) object
- the structure differs from service to service
- in the case of MeaningCloud, it has a part that reports the status (e.g. error messages, remaining credits) and a part with the analysis
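
Because the status part travels with every answer, it is worth checking it before reading the analysis. The sketch below uses a shortened answer from the language identification service; the helper name `remaining_credits` is hypothetical, and treating code `"0"` as the only success value is an assumption based on the `OK` answer, so check the documentation for the full list of codes.

```python
import json

# A shortened answer from the language identification service.
raw = ('{"status":{"code":"0","msg":"OK","credits":"1","remaining_credits":"39990"},'
       '"language_list":[{"language":"pl","relevance":"100","name":"Polish"}]}')


def remaining_credits(parsed):
    """Raise if the status part reports a problem, else return the credits left."""
    status = parsed["status"]
    if status["code"] != "0":  # assumption: "0" is the success code
        raise RuntimeError(f"API error {status['code']}: {status['msg']}")
    return int(status["remaining_credits"])


parsed = json.loads(raw)
print(remaining_credits(parsed))  # 39990
```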

# Processing JSON in python

- JSON (JavaScript Object Notation) is a lightweight data interchange format inspired by JavaScript object literal syntax
- It was designed for human-readable data interchange
- It is easy to read and write
- Data is represented in name/value pairs
- Curly braces hold objects; each name is followed by a colon (`:`) and the name/value pairs are separated by commas (`,`)
- Square brackets hold arrays, whose values are separated by commas (`,`)
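
The syntax rules above can be checked on a tiny document. The values below are made up for illustration; only the notation matters.

```python
import json

# An object (curly braces) holding a string, a number and an array (square brackets).
document = '{"name": "Polish", "relevance": 100, "codes": ["pl", "pol"]}'

data = json.loads(document)
print(type(data).__name__)           # dict
print(type(data["codes"]).__name__)  # list
print(data["codes"][1])              # pol
```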


# Example of JSON 

- JSON returned by the language identification service

In [7]:
print(response.text)

{"status":{"code":"0","msg":"OK","credits":"1","remaining_credits":"39990"},"language_list":[{"language":"pl","relevance":"100","name":"Polish","iso639-3":"pol","iso639-2":"pl"}]}


# Processing JSON in python

- there is a variety of libraries, but we will use the default one

```python
import json
```

- to parse the answer we need to load it. Make sure you use **loads** (which parses a string), not **load** (which reads from a file object).
```python
parsed_json = json.loads(response.text)
```
- JSON objects can be processed exactly like Python dictionaries, and JSON arrays like Python lists.
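
The difference between the two functions can be seen on a small, made-up answer: `json.loads` takes a string, while `json.load` takes an object with a `read()` method, here an in-memory `io.StringIO` standing in for an open file.

```python
import io
import json

text = '{"language_list": [{"language": "pl", "name": "Polish"}]}'

from_string = json.loads(text)            # loads: parse a str
from_file = json.load(io.StringIO(text))  # load: read from a file-like object

print(from_string == from_file)  # True

# Objects behave like dicts, arrays like lists.
first = from_string["language_list"][0]
print(first["name"])  # Polish
```

Passing the string itself to `json.load` fails with an `AttributeError`, because a string has no `read()` method.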


In [8]:
import json
parsed_json = json.loads(response.text)
# Use a loop variable other than `key`, which holds the API key
for name in parsed_json:
    print(name, "=>", parsed_json[name])

status => {'code': '0', 'msg': 'OK', 'credits': '1', 'remaining_credits': '39990'}
language_list => [{'language': 'pl', 'relevance': '100', 'name': 'Polish', 'iso639-3': 'pol', 'iso639-2': 'pl'}]


In [9]:
status = parsed_json["status"]
# Again, avoid reusing `key` so the API key is not overwritten
for field in status:
    print(field, "=>", status[field])

code => 0
msg => OK
credits => 1
remaining_credits => 39990
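
The analysis part can be read in the same way. The sketch below copies the parsed answer shown above so that it runs on its own; `detected_languages` is a hypothetical helper, and converting `relevance` to an integer assumes the service always returns it as a numeric string, as it does here.

```python
# The parsed answer from the language identification call, copied for a standalone run.
parsed_json = {
    "status": {"code": "0", "msg": "OK", "credits": "1", "remaining_credits": "39990"},
    "language_list": [
        {"language": "pl", "relevance": "100", "name": "Polish",
         "iso639-3": "pol", "iso639-2": "pl"},
    ],
}


def detected_languages(parsed):
    """Return (name, relevance) pairs, most relevant first."""
    pairs = [(item["name"], int(item["relevance"])) for item in parsed["language_list"]]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


print(detected_languages(parsed_json))  # [('Polish', 100)]
```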


# Exercise

1. Create an account on MeaningCloud
2. Using the steps above, call one of the available APIs other than Language Identification. Read the documentation to understand what the service does and what input it expects.
3. Display the information that you find meaningful.

# Further reading

- Tutorial about JSON: <a href="https://www.tutorialspoint.com/json/index.htm" target="_blank">https://www.tutorialspoint.com/json/index.htm</a>
- Learn more about the services offered by MeaningCloud: <a href="https://www.meaningcloud.com/developer/documentation" target="_blank">https://www.meaningcloud.com/developer/documentation</a>
- Read about the standard JSON library in Python: <a href="https://docs.python.org/3.6/library/json.html" target="_blank">https://docs.python.org/3.6/library/json.html</a>
