<a href="https://colab.research.google.com/github/sayakpaul/GCP-ML-API-Demos/blob/master/Abstract_Parser.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This demo fetches details about arXiv papers matching a query keyword, parses the abstracts from those details, and then uses GCP's [Text-to-Speech](https://cloud.google.com/text-to-speech) API to create an audio clip of a chosen abstract. The workflow of the demo is as follows:

<div align="center"><img src="https://i.ibb.co/YpMCRXw/image.png"></img></div>

**Note**:

This demo requires a billing-enabled GCP project with the Text-to-Speech API enabled. You should also have your GCP credentials key in `json` format (refer [here](https://cloud.google.com/docs/authentication/getting-started)). I followed the official samples and tutorials of the APIs (available at the aforementioned links) to develop this demo. To fetch the paper details, I used the [`arxiv`](https://pypi.org/project/arxiv/) Python library, which internally uses the [arXiv API](https://arxiv.org/help/api/user-manual).
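
Before pointing `GOOGLE_APPLICATION_CREDENTIALS` at an uploaded key, it can help to confirm that the file at least looks like a service-account key. The following is a minimal sketch and not part of the original demo: the helper name `looks_like_service_account_key` is hypothetical, and the set of fields it checks is an assumption about what such key files normally contain. It does not verify that the key is valid or that the API is enabled.

```python
import json

# Fields a service-account JSON key is assumed to contain (illustrative subset).
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}


def looks_like_service_account_key(path):
    """Return True if the JSON file at `path` resembles a service-account key."""
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, ValueError):
        # The file is missing, unreadable, or not valid JSON.
        return False
    if not isinstance(data, dict):
        return False
    # The "type" field must say "service_account" and all assumed fields must exist.
    return data.get("type") == "service_account" and REQUIRED_FIELDS.issubset(data)
```

In Colab, this could be called with the path of the uploaded key (for example, a file under `/content/`) before setting the environment variable.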

**Acknowledgements**:

This demo is based on [another cool demo](https://www.linkedin.com/feed/update/urn:li:activity:6706930343590584320/) created by Dale and Kaz. 

Thanks to the [GDE program](https://developers.google.com/programs/experts/) for providing the GCP credit support that made this demo possible.

<div align="center"><img src="https://i.ibb.co/ZXtwJjV/Webp-net-resizeimage.png" width="100" height="100"></img></div>

In [None]:
#@title Upload your GCP credentials key to Colab
from google.colab import files
files.upload()

In [None]:
#@title Install Python client libraries
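# The query code in this notebook uses `arxiv.query`, which is only available in pre-1.0 releases of `arxiv`
!pip install "arxiv<1.0"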
!pip install google-cloud-texttospeech

In [None]:
#@title Set the path to GCP credentials key
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/content/fast-ai-exploration-f32c198aac7e.json' 
!echo $GOOGLE_APPLICATION_CREDENTIALS

/content/fast-ai-exploration-f32c198aac7e.json


In [None]:
#@title Imports
from google.cloud import texttospeech
from IPython.display import Audio
from tqdm.notebook import tqdm
import arxiv
import html

In [None]:
#@title Utility function to fetch the 10 most recently updated papers for the supplied keyword
# Query arXiv and collect the matching results
def query_with_keywords(query):
    results = arxiv.query(query=query, 
                        max_results=10,
                        max_chunk_results=10, 
                        iterative=False, 
                        prune=True,
                        sort_by="lastUpdatedDate")
    terms = []
    titles = []
    abstracts = []
    for res in tqdm(results):
        # Keep only papers whose primary category is one of the three of interest
        if res['arxiv_primary_category']["term"] in ("cs.CV", "stat.ML", "cs.LG"):
            print(res['arxiv_primary_category']["term"])
            print(res["title"])

            terms.append(res['arxiv_primary_category']["term"])
            titles.append(res["title"])
            abstracts.append(res["summary"])

    return terms, titles, abstracts
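
The category filter used in `query_with_keywords` can be tried out without any network access. The following is a minimal sketch, assuming result records shaped like the dictionaries accessed above (`arxiv_primary_category` → `term`, `title`, `summary`). The helper name `filter_by_category` and the sample records are hypothetical and only serve as an illustration.

```python
# Primary categories kept by the demo's filter.
ALLOWED_TERMS = {"cs.CV", "stat.ML", "cs.LG"}


def filter_by_category(results, allowed=ALLOWED_TERMS):
    """Return parallel lists of terms, titles, and abstracts for allowed categories."""
    terms, titles, abstracts = [], [], []
    for res in results:
        term = res["arxiv_primary_category"]["term"]
        if term in allowed:
            terms.append(term)
            titles.append(res["title"])
            abstracts.append(res["summary"])
    return terms, titles, abstracts


# Made-up records for illustration only; real records come from the arXiv API.
sample = [
    {"arxiv_primary_category": {"term": "cs.CV"}, "title": "Paper A", "summary": "Abstract A."},
    {"arxiv_primary_category": {"term": "math.CO"}, "title": "Paper B", "summary": "Abstract B."},
]
print(filter_by_category(sample))
# → (['cs.CV'], ['Paper A'], ['Abstract A.'])
```

Because fewer than 10 results may survive the filter, the returned lists can be shorter than `max_results`, as in the output of the next cell.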

In [None]:
#@title Fetch paper title, terms, and abstracts
#@markdown Enter a query keyword below
query_keyword = "moco" #@param {type:"string"}
terms, titles, abstracts = query_with_keywords(query_keyword)


cs.CV
A Framework For Contrastive Self-Supervised Learning And Designing A New
  Approach
cs.CV
Demystifying Contrastive Self-Supervised Learning: Invariances,
  Augmentations and Dataset Biases
cs.CV
Parametric Instance Classification for Unsupervised Visual Feature
  Learning
cs.CV
What makes instance discrimination good for transfer learning?
cs.CV
Momentum Contrast for Unsupervised Visual Representation Learning
cs.CV
Improved Baselines with Momentum Contrastive Learning
cs.CV
Calligraphic Stylisation Learning with a Physiologically Plausible Model
  of Movement and Recurrent Neural Networks



In [None]:
#@title Utility functions for generating audio
#@markdown Courtesy: https://cloud.google.com/text-to-speech/docs/ssml-tutorial
def ssml_to_audio(ssml_text, outfile):
    # Synthesizes audio from SSML text.
    #
    # Given a string of SSML text and an output file name, this function
    # calls the Text-to-Speech API. The API returns a synthetic audio
    # version of the text, formatted according to the SSML commands. This
    # function saves the synthetic audio to the designated output file.
    #
    # Args:
    # ssml_text: string of SSML text
    # outfile: string name of file under which to save audio output
    #
    # Returns:
    # string name of the file the audio was written to

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Sets the text input to be synthesized
    synthesis_input = texttospeech.SynthesisInput(ssml=ssml_text)

    # Builds the voice request, selects the language code ("en-US") and
    # the SSML voice gender ("MALE")
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.MALE
    )

    # Selects the type of audio file to return
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Performs the text-to-speech request on the text input with the selected
    # voice parameters and audio file type
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    # Writes the synthetic audio to the output file.
    with open(outfile, "wb") as out:
        out.write(response.audio_content)
        print("Audio content written to file " + outfile)

    return str(outfile)

To keep this demo short, we will only operate on the first result.

In [None]:
#@title First, generate audio for the paper title
title = "First paper is {}".format(titles[0])
ssml = "<speak>{}</speak>".format(
    title.replace("paper is", 'paper is <break time="500ms"/>'))
filename = ssml_to_audio(ssml, "title.mp3")
Audio(filename=filename, autoplay=True)    

Audio content written to file title.mp3
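
The title cell above inserts the raw title into the SSML string. arXiv titles can wrap across lines (as in the printed results earlier) and may contain characters such as `&` that are not valid in SSML without escaping. The following is a minimal sketch of a more defensive variant; the helper name `title_to_ssml` and its `pause_ms` parameter are hypothetical and not part of the original demo.

```python
import html


def title_to_ssml(title, pause_ms=500):
    """Build SSML that announces a paper title after a short pause."""
    # Collapse line breaks and repeated spaces from wrapped titles.
    clean = " ".join(title.split())
    # Escape &, <, > and quotes so the title cannot break the SSML markup.
    escaped = html.escape(clean)
    return '<speak>First paper is <break time="{}ms"/> {}</speak>'.format(
        pause_ms, escaped
    )


print(title_to_ssml("Contrast & Compare:\n  A Study"))
# → <speak>First paper is <break time="500ms"/> Contrast &amp; Compare: A Study</speak>
```

The returned string could be passed to `ssml_to_audio` in place of the `ssml` variable built in the cell above.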


In [None]:
#@title If you haven't heard about the paper yet, proceed to generate audio for the abstract
first_abstract = html.escape(abstracts[0])
ssml = "<speak>{}</speak>".format(
        first_abstract.replace(".", '<s/>')
    )
filename = ssml_to_audio(ssml, "abstract.mp3")
Audio(filename=filename, autoplay=True)

Audio content written to file abstract.mp3
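
The abstract cell above replaces every period with an empty `<s/>` tag, which also affects periods inside numbers such as `0.5`. As an alternative, each sentence can be wrapped in an SSML `<s>` element. The following is a minimal sketch under simple assumptions: the helper name `abstract_to_ssml` is hypothetical, and the sentence splitter is a plain regular expression that will still mis-split abbreviations such as "e.g. " followed by a space.

```python
import html
import re


def abstract_to_ssml(abstract):
    """Wrap each sentence of an abstract in <s> tags inside a <speak> element."""
    # Collapse line breaks and repeated spaces left over from the feed.
    text = " ".join(abstract.split())
    # Split after ., ! or ? only when whitespace follows, so "0.5" stays intact.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Escape each sentence separately so the <s> tags themselves are not escaped.
    body = "".join("<s>{}</s>".format(html.escape(s)) for s in sentences if s)
    return "<speak>{}</speak>".format(body)


print(abstract_to_ssml("We gain 0.5 points.\nIt works & scales."))
# → <speak><s>We gain 0.5 points.</s><s>It works &amp; scales.</s></speak>
```

The result could be passed to `ssml_to_audio` in the same way as the `ssml` string built in the cell above.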
