# A workshop series on developing and using an intelligent, interactive chatbot in WeChat

### WeChat is a popular social media app, which has more than 800 million monthly active users.

<img src='http://www.kudosdata.com/wp-content/uploads/2016/11/cropped-KudosLogo1.png' width=30% style="float: right;">
<img src='reference/WeChat_SamGu_QR.png' width=10% style="float: right;">

### http://www.KudosData.com

by: Sam.Gu@KudosData.com


May 2017. Scan the QR code to add the trainer as a friend in WeChat.

### Lesson 3: Natural Language Processing
* Speech synthesis: text to voice
* Speech recognition: voice to text
* Text-based language translation

### Using Google Cloud Platform's Machine Learning APIs

In the Google Cloud API console, choose "Dashboard" in the left-hand menu, then "Enable API".

Enable the following APIs for your project (search for them) if they are not already enabled:
<ol>
<li> Google Translate API </li>
<li> Google Cloud Vision API </li>
<li> Google Natural Language API </li>
<li> Google Cloud Speech API </li>
</ol>

Finally, because we are calling the APIs from Python (clients are available in many other languages), install the Google API Python client package, which is not installed by default on Datalab:

In [3]:
# Copyright 2016 Google Inc.
# Licensed under the Apache License, Version 2.0 (the "License"); 

# !pip install --upgrade google-api-python-client

### Import the required libraries:

In [4]:
import io, os, subprocess, sys, time, datetime, requests, itchat
from itchat.content import *
from googleapiclient.discovery import build


### GCP Machine Learning API Key

First, visit the <a href="http://console.cloud.google.com/apis">API console</a> and choose "Credentials" in the left-hand menu. Choose "Create Credentials" and generate an API key for your application. You should normally restrict the key by IP address to prevent abuse; for this demo, leave that field blank and delete the API key once you have finished.

Copy-paste your API Key here:

In [2]:
# Here I read in my own API_KEY from a file, which is not shared in the GitHub repository.
# strip() removes the trailing newline, which would otherwise become part of the key:
with io.open('../../API_KEY.txt') as fp: 
    for line in fp: APIKEY = line.strip()

# You need to un-comment below line and replace 'APIKEY' variable with your own GCP API key:
# APIKEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
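
A minimal sketch of a slightly more defensive way to read the key from a file. The helper name `load_api_key` and the placeholder key are illustrative assumptions, not part of the original notebook. Unlike the loop above, which keeps the last line of the file, this version returns the first non-empty line and raises an error if the file contains no key.

```python
import io
import os
import tempfile


def load_api_key(path):
    """Return the first non-empty line of the file at `path`, stripped of whitespace."""
    with io.open(path, encoding='utf-8') as fp:
        for line in fp:
            key = line.strip()
            if key:
                return key
    raise ValueError('No API key found in %s' % path)


# Demonstration with a throwaway file and a placeholder key (not a real credential).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('\nPLACEHOLDER_KEY_123\n')
demo_key = load_api_key(tmp.name)
os.remove(tmp.name)
print(repr(demo_key))
```

Failing loudly on an empty file tends to be easier to debug than an authentication error returned later by the API.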

In [None]:
# Below is for GCP Language Translation API
service = build('translate', 'v2', developerKey=APIKEY)
# Below is for GCP Speech API
# sservice = build('speech', 'v1beta1', developerKey=APIKEY)
speech_service = build('speech', 'v1', developerKey=APIKEY)

### Base64 encoding of binary media (define image and audio pre-processing functions)

In [5]:
# Import the base64 encoding library.
import base64

# Read a binary file and return its content as base64 text.
def encode_file(file_name):
    # Use a separate name for the file handle so it does not shadow the argument.
    with io.open(file_name, "rb") as binary_file:
        content = binary_file.read()
    if sys.version_info[0] < 3:
        # Python 2: b64encode already returns a str
        return base64.b64encode(content)
    # Python 3: b64encode returns bytes, so decode to str for the JSON request body
    return base64.b64encode(content).decode('utf-8')

# Pass the image data to an encoding function.
def encode_image(image_file):
    return encode_file(image_file)

# Pass the audio data to an encoding function.
def encode_audio(audio_file):
    return encode_file(audio_file)
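
As a small illustration of why the Python 3 branch calls `.decode('utf-8')`, the sketch below encodes some arbitrary bytes and decodes them again. The sample bytes are made up for the demonstration; it assumes Python 3.

```python
import base64

# Arbitrary binary data standing in for the content of an audio or image file.
raw_bytes = b'\x00\x01binary audio or image bytes\xff'

# On Python 3, b64encode returns bytes; decoding gives a str that can be
# placed directly in a JSON request body.
encoded_text = base64.b64encode(raw_bytes).decode('utf-8')

# Decoding the base64 text recovers the original bytes exactly.
round_trip = base64.b64decode(encoded_text)
print(type(encoded_text).__name__, round_trip == raw_bytes)
```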


### Define control parameters for the machine learning APIs

In [17]:
# control parameter for Image API:
parm_image_maxResults = 10 # max objects or faces to be extracted from image analysis

# control parameter for Language Translation API:
parm_translation_origin_language = '' # original language in text: to be overwritten by TEXT_DETECTION
parm_translation_target_language = 'zh' # target language for translation: Chinese

# control parameter for Speech API:
# parm_speech_origin_language = 'en-US' # speech API 'voice to text' language
parm_speech_origin_language = 'en' # speech API 'voice to text' language


### * Speech synthesis: text to voice

### * Speech recognition: voice to text

The Speech API can work on streaming data, audio content encoded and embedded directly into the POST message, or on a file on Cloud Storage.
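
The two non-streaming options differ only in the `audio` part of the request body: embedded audio goes in a `content` field as base64 text, while a Cloud Storage file is referenced through a `uri` field. The sketch below builds both bodies. The helper name `build_recognize_body` and the bucket path are hypothetical, and streaming recognition is not covered here.

```python
def build_recognize_body(language_code, content_b64=None, gcs_uri=None):
    """Build a request body for speech recognition from exactly one audio source."""
    if (content_b64 is None) == (gcs_uri is None):
        raise ValueError('Provide exactly one of content_b64 or gcs_uri')
    if content_b64 is not None:
        audio = {'content': content_b64}  # base64 text embedded in the POST body
    else:
        audio = {'uri': gcs_uri}          # file stored on Cloud Storage
    return {'config': {'languageCode': language_code}, 'audio': audio}


# 'ZmFrZQ==' is the base64 of the placeholder bytes b'fake', not real audio.
inline_body = build_recognize_body('en-US', content_b64='ZmFrZQ==')
# The bucket and object names below are hypothetical.
gcs_body = build_recognize_body('en-US', gcs_uri='gs://example-bucket/sample.flac')
print(inline_body)
print(gcs_body)
```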

In [7]:
#    msg.download(msg.fileName)
#    print('\nDownloaded image file name is: %s' % msg['FileName'])

#    audio_file_input = msg['FileName']
#    audio_type = ['flac', 'wav']

# Running Speech API
def KudosData_voice_to_text(audio_file_input, audio_type):
    audio_file_output = str(audio_file_input) + '.' + str(audio_type)
    print('audio_file_input  : %s' % audio_file_input)
    print('audio_file_output : %s' % audio_file_output)
    
    # Convert the input file (e.g. mp3) to a mono audio file in the target format.
    # Remove any existing output file first, so ffmpeg does not stop to ask about overwriting it.
    if os.path.exists(audio_file_output):
        os.remove(audio_file_output)

    # GCP: use avconv to convert audio
    # retcode = subprocess.call(['avconv', '-i', audio_file_input, '-ac', '1', audio_file_output])
    # VM : use ffmpeg to convert audio
    with io.open(os.devnull, "w") as FNULL: # suppress ffmpeg console output
        retcode = subprocess.call(['ffmpeg', '-i', audio_file_input, '-ac', '1', audio_file_output],
                                  stdout=FNULL, stderr=subprocess.STDOUT)
    # print(retcode)

    # Call GCP Speech API:
    # response = speech_service.speech().syncrecognize(
    response = speech_service.speech().recognize(
        body={
            'config': {
#                 'encoding': 'LINEAR16',
#                 'sampleRateHertz': 16000,
                'languageCode': parm_speech_origin_language
            },
            'audio': {
                'content': encode_audio(audio_file_output) # base64 of converted audio file, for speech recognition
                }
            }).execute()    
    print('Completed: Speech API: Voice -> Text ...')
    return response
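
The conversion step inside `KudosData_voice_to_text` can be isolated as a small command builder, which makes the output file name and the ffmpeg arguments easy to inspect. This is a sketch only: the helper name `build_ffmpeg_command` is an assumption, and it uses ffmpeg's `-y` option to overwrite an existing output file instead of deleting it beforehand. The command is built but not executed here.

```python
def build_ffmpeg_command(audio_file_input, audio_type, channels=1):
    """Return the ffmpeg argument list and the output file name for a conversion."""
    audio_file_output = '%s.%s' % (audio_file_input, audio_type)
    command = [
        'ffmpeg',
        '-y',                    # overwrite the output file without prompting
        '-i', audio_file_input,  # input file
        '-ac', str(channels),    # number of audio channels (1 = mono)
        audio_file_output,
    ]
    return command, audio_file_output


command, output_path = build_ffmpeg_command('reference/eng_sample.mp3', 'flac')
print(' '.join(command))
```

The returned list can be passed to `subprocess.call`, and checking that the return code is zero before calling the Speech API avoids sending a missing or incomplete file.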

In [24]:
##########################
# main()
##########################

#    msg.download(msg.fileName)
#    print('\nDownloaded image file name is: %s' % msg['FileName'])

#    audio_file_input = msg['FileName']
#    audio_type = ['flac', 'wav']

response = KudosData_voice_to_text('reference/eng_sample.mp3', 'flac')
# response = KudosData_voice_to_text('reference/170512-061221.mp3', 'flac')
# response = KudosData_voice_to_text('reference/eng_sample.mp3', 'wav')

audio_file_input  : reference/eng_sample.mp3
audio_file_output : reference/eng_sample.mp3.flac


timeout: The write operation timed out

In [19]:
response

{}

In [20]:
if response != {}:
    print (response['results'][0]['alternatives'][0]['transcript'])
    print ('( confidence: %f )' % response['results'][0]['alternatives'][0]['confidence'])
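
The nested lookups above raise an error if any level of the response is missing. A more tolerant way to pull out the top transcript is sketched below; the helper name `best_transcript` and the sample response are hypothetical and only mirror the structure used in the cell above.

```python
def best_transcript(response):
    """Return (transcript, confidence) for the top alternative, or None if absent."""
    results = response.get('results') or []
    if not results:
        return None
    alternatives = results[0].get('alternatives') or []
    if not alternatives:
        return None
    top = alternatives[0]
    return top.get('transcript', ''), top.get('confidence')


# Hypothetical response with the same shape as the one indexed above.
sample_response = {
    'results': [
        {'alternatives': [{'transcript': 'hello world', 'confidence': 0.9}]}
    ]
}
print(best_transcript(sample_response))
print(best_transcript({}))
```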

In [None]:
##########################
# main()
##########################

#    msg.download(msg.fileName)
#    print('\nDownloaded image file name is: %s' % msg['FileName'])

#    audio_file_input = msg['FileName']
#    audio_type = ['flac', 'wav']

# response = KudosData_voice_to_text('reference/eng_sample.mp3', 'flac')
response = KudosData_voice_to_text('reference/eng_sample.mp3', 'wav')

audio_file_input  : reference/eng_sample.mp3
audio_file_output : reference/eng_sample.mp3.wav


In [None]:
if response != {}:
    print (response['results'][0]['alternatives'][0]['transcript'])
    print ('( confidence: %f )' % response['results'][0]['alternatives'][0]['confidence'])

### * Text-based language translation

In [None]:
# Running Vision API
# 'TEXT_DETECTION'
def KudosData_TEXT_DETECTION(image_base64, API_type, maxResults):
    vservice = build('vision', 'v1', developerKey=APIKEY)
    request = vservice.images().annotate(body={
        'requests': [{
                'image': {
#                     'source': {
#                         'gcs_image_uri': IMAGE
#                     }
                      "content": image_base64
                },
                'features': [{
                    'type': API_type,
                    'maxResults': maxResults,
                }]
            }],
        })
    responses = request.execute(num_retries=3)
    image_analysis_reply = u'\n[ ' + API_type + u' 文字提取 ]\n'
    # 'TEXT_DETECTION'
    if responses['responses'][0] != {}:
        image_analysis_reply += u'----- Start Origin Text -----\n'
        image_analysis_reply += u'( Original Language 原文: ' + str(responses['responses'][0]['textAnnotations'][0]['locale']) \
        + ' )\n'        
        image_analysis_reply += responses['responses'][0]['textAnnotations'][0]['description'] + '----- End Origin Text -----\n'

        ##############################################################################################################
        #                                        translation of detected text                                        #
        ##############################################################################################################
        parm_translation_origin_language = str(responses['responses'][0]['textAnnotations'][0]['locale'])
        # Call translation if parm_translation_origin_language is not parm_translation_target_language
        if parm_translation_origin_language != parm_translation_target_language:
            inputs=[responses['responses'][0]['textAnnotations'][0]['description']] # TEXT_DETECTION OCR results only
            outputs = service.translations().list(source=parm_translation_origin_language, 
                                                  target=parm_translation_target_language, q=inputs).execute()
            image_analysis_reply += u'\n----- Start Translation -----\n'
            image_analysis_reply += u'( Target Language 译文: ' + parm_translation_target_language + ' )\n'
            image_analysis_reply += outputs['translations'][0]['translatedText'] + '\n' + '----- End Translation -----\n'
            print('Completed: Translation    API ...')
        ##############################################################################################################
        
    return image_analysis_reply
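
The Translation API (v2) treats text as HTML by default, so translated text may contain entities such as `&#39;` in place of an apostrophe. The sketch below shows one way to read the first translation and unescape it before sending it to WeChat. The helper name `extract_translation` and the sample output are hypothetical, the response shape follows the function above, and `html.unescape` requires Python 3.

```python
import html


def extract_translation(outputs):
    """Return the first translated text with HTML entities unescaped, or '' if none."""
    translations = outputs.get('translations') or []
    if not translations:
        return ''
    return html.unescape(translations[0].get('translatedText', ''))


# Hypothetical output with the same shape as the one indexed in the function above.
sample_outputs = {'translations': [{'translatedText': 'It&#39;s a test'}]}
print(extract_translation(sample_outputs))
```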

### Log in automatically by scanning the QR code image with the WeChat app

In [None]:
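itchat.auto_login(hotReload=True) # hotReload=True: cache the login state on exit, so restarting the program within a short period does not require scanning the QR code again.
# itchat.auto_login(enableCmdQR=-2) # enableCmdQR=-2: display the QR code in the command line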

In [None]:
@itchat.msg_register([RECORDING], isGroupChat=True)
@itchat.msg_register([RECORDING])
def download_files(msg):
    parm_translation_origin_language = 'zh' # will be overwritten by TEXT_DETECTION
    msg.download(msg.fileName)
    print('\nDownloaded audio file name is: %s' % msg['FileName'])
    
    ##############################################################################################################
    #                                          call audio analysis APIs                                          #
    ##############################################################################################################
    
    audio_analysis_reply = u'[ Audio Analysis Results 音频处理结果 ]\n'

    # 1. Text to Voice:
#     audio_analysis_reply += KudosData_voice_to_text(msg['FileName'], 'flac')
#     audio_analysis_reply += KudosData_LABEL_DETECTION(audio_base64, 'LABEL_DETECTION', parm_audio_maxResults)
    # 2. Voice to Text:
    audio_analysis_reply += u'\n[ Voice -> Text 语音识别 ]\n'
    response = KudosData_voice_to_text(msg['FileName'], 'flac')
#     response = KudosData_voice_to_text(msg['FileName'], 'wav')
    if response != {}:
        print (response['results'][0]['alternatives'][0]['transcript'])
        print ('( confidence: %f )' % response['results'][0]['alternatives'][0]['confidence'])
        audio_analysis_reply += response['results'][0]['alternatives'][0]['transcript'] + '\n'
        audio_analysis_reply += '( confidence: ' + str(response['results'][0]['alternatives'][0]['confidence']) + ' )\n'
    
    # 3. Text Translation:
#     audio_analysis_reply += KudosData_voice_to_text(msg['FileName'], 'flac')
#     audio_analysis_reply += KudosData_LOGO_DETECTION(audio_base64, 'LOGO_DETECTION', parm_audio_maxResults)

    print('Completed: Audio Analysis API ...')
    
    return audio_analysis_reply

In [None]:
itchat.run()

In [None]:
# interrupt the kernel, then log out
itchat.logout() # log out safely

### Congratulations! You have completed:
### Lesson 3: Natural Language Processing
* Speech synthesis: text to voice
* Speech recognition: voice to text
* Text-based language translation



