## Using the MeaningCloud Service

The [MeaningCloud.com](http://MeaningCloud.com) service provides 20,000 requests per month and offers advanced natural language analysis tools for several languages.

You need to sign in to the service, either with GitHub or by registering directly on the site. You will then receive an API key right away.

Let's try using the service to extract keywords and analyze sentences (topic modeling):

In [3]:
import requests

key = '=== YOUR KEY HERE ==='

def analyze(x, lang='en', topics='a'):
    """Send text x to the topics-2.0 endpoint and return the parsed JSON reply."""
    url = "https://api.meaningcloud.com/topics-2.0"

    payload = {
        'key': key,
        'txt': x,
        'lang': lang,  # 2-letter code, like en es fr ...
        'tt': topics   # topic types to extract; 'a' requests all of them
    }

    response = requests.post(url, data=payload)
    return response.json()

analyze('I would love to visit Torino one day; it seems like one of the best cities in Italy!')

{'concept_list': [{'form': 'city',
   'id': '817857ee40',
   'relevance': '100',
   'sementity': {'class': 'class',
    'fiction': 'nonfiction',
    'id': 'ODENTITY_CITY',
    'type': 'Top>Location>GeoPoliticalEntity>City'},
   'semld_list': ['http://en.wikipedia.org/wiki/City',
    'http://ar.wikipedia.org/wiki/مدينة',
    'http://ca.wikipedia.org/wiki/Ciutat',
    'http://cs.wikipedia.org/wiki/Město',
    'http://de.wikipedia.org/wiki/Stadt',
    'http://es.wikipedia.org/wiki/Ciudad',
    'http://fi.wikipedia.org/wiki/Kaupunki',
    'http://fr.wikipedia.org/wiki/Ville',
    'http://he.wikipedia.org/wiki/עיר',
    'http://hi.wikipedia.org/wiki/शहर',
    'http://id.wikipedia.org/wiki/Kota',
    'http://it.wikipedia.org/wiki/Città',
    'http://ja.wikipedia.org/wiki/都市',
    'http://ko.wikipedia.org/wiki/도시',
    'http://nl.wikipedia.org/wiki/Stad',
    'http://no.wikipedia.org/wiki/By',
    'http://pl.wikipedia.org/wiki/Miasto',
    'http://pt.wikipedia.org/wiki/Cidade',
    'http://ro

Note that the service links the objects it finds to its own [internal ontology](https://www.meaningcloud.com/developer/documentation/ontology#ODTHEME_TOP), as well as to external Wikipedia pages.
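To make this linking easier to read, the response can be flattened into one row per topic. The following is a minimal sketch: `summarize_topics` is a hypothetical helper, and `sample` is a hand-trimmed dictionary shaped like the output above, not a live API response.

```python
# Hand-trimmed sample shaped like the response shown above (not a real API call).
sample = {
    'concept_list': [{
        'form': 'city',
        'relevance': '100',
        'sementity': {'id': 'ODENTITY_CITY',
                      'type': 'Top>Location>GeoPoliticalEntity>City'},
        'semld_list': ['http://en.wikipedia.org/wiki/City',
                       'http://de.wikipedia.org/wiki/Stadt'],
    }],
    'entity_list': [],
}

def summarize_topics(result, wiki_lang='en'):
    """Return (form, ontology type, Wikipedia link or None) for each topic."""
    prefix = f'http://{wiki_lang}.wikipedia.org/'
    rows = []
    for key in ('entity_list', 'concept_list'):
        for item in result.get(key, []):
            onto_type = item.get('sementity', {}).get('type')
            # Pick the first link that points to the requested Wikipedia edition.
            link = next((u for u in item.get('semld_list', [])
                         if u.startswith(prefix)), None)
            rows.append((item['form'], onto_type, link))
    return rows

for row in summarize_topics(sample):
    print(row)
```

Passing `wiki_lang='de'` would pick the German Wikipedia link instead, when one is present in `semld_list`.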

Let's apply this service to the novel "Anna Karenina". First, we split the text into paragraphs:

In [17]:
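# Read the novel; the context manager closes the file when done
with open('../../../data/akar_en.txt', encoding='utf-8') as f:
    text = f.read()
# Drop the first 2 characters and the first 61 lines (presumably front matter)
novel = text[2:].split('\n')[61:]

def bypar(text):
    """Yield paragraphs from a list of lines; an empty line ends a paragraph."""
    s = ""
    for x in text:
        if x == "":
            if s != "":
                yield s[:-1]  # strip the trailing space
            s = ""
        else:
            s += x + " "
    if s != "":
        yield s[:-1]  # last paragraph when the text does not end with an empty line

list(bypar(novel))[:5]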

['Happy families are all alike; every unhappy family is unhappy in its own way.',
 'Three days after the quarrel, Prince Stepan Arkadyevitch Oblonsky—Stiva, as he was called in the fashionable world—woke up at his usual hour, that is, at eight o’clock in the morning, not in his wife’s bedroom, but on the leather-covered sofa in his study. He turned over his stout, well-cared-for person on the springy sofa, as though he would sink into a long sleep again; he vigorously embraced the pillow on the other side and buried his face in it; but all at once he jumped up, sat up on the sofa, and opened his eyes.',
 '“Yes, yes, how was it now?” he thought, going over his dream. “Now, how was it? To be sure! Alabin was giving a dinner at Darmstadt; no, not Darmstadt, but something American. Yes, but then, Darmstadt was in America. Yes, Alabin was giving a dinner on glass tables, and the tables sang, _Il mio tesoro_—not _Il mio tesoro_ though, but something better, and there were some sort of little
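The same grouping of lines into paragraphs can also be sketched with `itertools.groupby`. This is an alternative formulation for illustration, not the notebook's own method; `paragraphs` and `demo` are hypothetical names.

```python
from itertools import groupby

def paragraphs(lines):
    """Yield paragraphs: runs of non-empty lines joined by single spaces."""
    # groupby splits the line list into alternating runs of text and blank lines.
    for is_text, group in groupby(lines, key=lambda line: line != ""):
        if is_text:
            yield " ".join(group)

demo = ["First line", "continues here.", "", "",
        "Second paragraph.", "", "Last one, no trailing blank"]
print(list(paragraphs(demo)))
# ['First line continues here.', 'Second paragraph.', 'Last one, no trailing blank']
```

This version also emits the final paragraph when the input does not end with a blank line.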

Let's take the first 5 paragraphs:

In [19]:
import itertools as itt
text = "\n".join(itt.islice(bypar(novel),5))
print(text)

Happy families are all alike; every unhappy family is unhappy in its own way.
Three days after the quarrel, Prince Stepan Arkadyevitch Oblonsky—Stiva, as he was called in the fashionable world—woke up at his usual hour, that is, at eight o’clock in the morning, not in his wife’s bedroom, but on the leather-covered sofa in his study. He turned over his stout, well-cared-for person on the springy sofa, as though he would sink into a long sleep again; he vigorously embraced the pillow on the other side and buried his face in it; but all at once he jumped up, sat up on the sofa, and opened his eyes.
“Yes, yes, how was it now?” he thought, going over his dream. “Now, how was it? To be sure! Alabin was giving a dinner at Darmstadt; no, not Darmstadt, but something American. Yes, but then, Darmstadt was in America. Yes, Alabin was giving a dinner on glass tables, and the tables sang, _Il mio tesoro_—not _Il mio tesoro_ though, but something better, and there were some sort of little decanters

Let's analyze them as a single text:

In [20]:
res = analyze(text)
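Before working with the result, it may be worth checking that the call actually succeeded, for example that the key was valid and the request limit was not exceeded. The sketch below assumes the response carries a `status` object with `code` and `msg` fields, where code `'0'` means success. Verify this against the API documentation. `check_status` is a hypothetical helper, and the dictionaries are hand-made samples.

```python
def check_status(result):
    """Return result unchanged if its status code is '0', otherwise raise."""
    # Assumption: the reply contains {'status': {'code': ..., 'msg': ...}}.
    status = result.get('status', {})
    code = status.get('code')
    if code != '0':
        raise RuntimeError(
            f"MeaningCloud call failed: code={code!r}, msg={status.get('msg')!r}")
    return result

ok = {'status': {'code': '0', 'msg': 'OK'}, 'concept_list': []}
print(check_status(ok)['concept_list'])

try:
    check_status({'status': {'code': '100', 'msg': 'sample error'}})
except RuntimeError as err:
    print(err)
```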

Let's write the result to a file and take a look:

In [25]:
import json
with open('res.json','w',encoding='utf-8') as f:
    json.dump(res,f)
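Since the response contains non-ASCII Wikipedia links, the saved file is easier to inspect when written with `ensure_ascii=False` and an indent. A minimal sketch follows, using a hand-made `sample` dictionary and a temporary file path rather than the real `res`:

```python
import json
import os
import tempfile

# Hand-made sample with a non-ASCII link, shaped like part of the response.
sample = {'concept_list': [{'form': 'husband',
                            'semld_list': ['http://ru.wikipedia.org/wiki/Муж']}]}

path = os.path.join(tempfile.mkdtemp(), 'res_pretty.json')
with open(path, 'w', encoding='utf-8') as f:
    # ensure_ascii=False keeps 'Муж' readable instead of \uXXXX escapes;
    # indent=2 puts nested fields on separate lines.
    json.dump(sample, f, ensure_ascii=False, indent=2)

with open(path, encoding='utf-8') as f:
    loaded = json.load(f)

print(loaded == sample)
```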

The text is analyzed as a single whole. This is convenient for extracting entities, but inconvenient for analyzing individual sentences. So let's call the service on a single sentence instead:

In [32]:
text = next(itt.islice(bypar(novel),1,2))
lines = text.split('. ')
lines

['Everything was in confusion in the Oblonskys’ house',
 'The wife had discovered that the husband was carrying on an intrigue with a French girl, who had been a governess in their family, and she had announced to her husband that she could not go on living in the same house with him',
 'This position of affairs had now lasted three days, and not only the husband and wife themselves, but all the members of their family and household, were painfully conscious of it',
 'Every person in the house felt that there was no sense in their living together, and that the stray people brought together by chance in any inn had more in common with one another than they, the members of the family and household of the Oblonskys',
 'The wife did not leave her own room, the husband had not been at home for three days',
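Splitting on `'. '` drops the final period of each sentence and ignores `!` and `?`. A slightly more careful splitter can be sketched with a regular expression; `split_sentences` is a hypothetical helper and `demo` is a made-up string. It still breaks on abbreviations such as "Mr. Smith", so a dedicated sentence tokenizer would be more robust.

```python
import re

def split_sentences(paragraph):
    """Split after '.', '!' or '?' followed by whitespace, keeping the punctuation."""
    # The lookbehind keeps the punctuation mark attached to its sentence.
    parts = re.split(r'(?<=[.!?])\s+', paragraph.strip())
    return [s for s in parts if s]

demo = "Everything was in confusion. Was it? Yes! It had lasted three days."
for s in split_sentences(demo):
    print(s)
```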

Let's analyze one sentence. In this example: *The wife had discovered that the husband was carrying on an intrigue with a French girl, who had been a governess in their family, and she had announced to her husband that she could not go on living in the same house with him*

In [35]:
res = analyze(lines[1])

In [36]:
res

{'concept_list': [{'form': 'husband',
   'id': '6a17861b62',
   'relevance': '100',
   'sementity': {'class': 'class',
    'fiction': 'nonfiction',
    'id': 'ODENTITY_PERSON',
    'type': 'Top>Person'},
   'semld_list': ['http://en.wikipedia.org/wiki/Husband',
    'http://ar.wikipedia.org/wiki/زوج',
    'http://ca.wikipedia.org/wiki/Marit',
    'http://he.wikipedia.org/wiki/בעל_(משפחה)',
    'http://id.wikipedia.org/wiki/Suami',
    'http://it.wikipedia.org/wiki/Marito',
    'http://ko.wikipedia.org/wiki/남편',
    'http://no.wikipedia.org/wiki/Ektemann',
    'http://ro.wikipedia.org/wiki/Soț',
    'http://ru.wikipedia.org/wiki/Муж',
    'http://sv.wikipedia.org/wiki/Make_(äktenskap)',
    'http://zh.wikipedia.org/wiki/丈夫',
    'http://d-nb.info/gnd/4070668-0',
    'sumo:Human'],
   'variant_list': [{'endp': '39', 'form': 'husband', 'inip': '33'},
    {'endp': '166', 'form': 'husband', 'inip': '160'}]}],
 'entity_list': [],
 'money_expression_list': [],
 'other_expression_list': [],
 'q

Let's extract from this sentence the individual relations that correspond to atomic fragments of meaning:

In [48]:
def complist(l):
    return ', '.join([x['form'] for x in l])

for r in res['relation_list']:
    print(f"{r['subject']['lemma_list']} -> {r['verb']['lemma_list']} -> {complist(r['complement_list'])}")

['wife'] -> ['discover'] -> that the husband was carrying on an intrigue with a French girl
['husband'] -> ['be'] -> a governess, in their family, was carrying, who had been a governess in their family
['husband'] -> ['carry'] -> on an intrigue with a French girl
['she'] -> ['announce'] -> to her husband
['she'] -> ['go on'] -> 
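For further processing it can be handy to collect the relations as plain tuples rather than printing them. The sketch below is defensive about missing keys, on the assumption that some relations may lack a subject or complements. `to_triples` is a hypothetical helper, and `sample_relations` is a hand-made list that uses only the field names seen above.

```python
# Hand-made sample using the field names from the response (not a real API reply).
sample_relations = [
    {'subject': {'lemma_list': ['wife']},
     'verb': {'lemma_list': ['discover']},
     'complement_list': [
         {'form': 'that the husband was carrying on an intrigue with a French girl'}]},
    {'subject': {'lemma_list': ['she']},
     'verb': {'lemma_list': ['go on']}},  # assumption: complements may be absent
]

def to_triples(relations):
    """Convert relations into (subject, verb, [complements]) tuples."""
    triples = []
    for r in relations:
        subj = ', '.join(r.get('subject', {}).get('lemma_list', []))
        verb = ', '.join(r.get('verb', {}).get('lemma_list', []))
        comps = [c['form'] for c in r.get('complement_list', [])]
        triples.append((subj, verb, comps))
    return triples

for t in to_triples(sample_relations):
    print(t)
```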
