Home
Welcome to the SearchBullet wiki!
This repository is the result of the individual-work part of the course project 'search by voice'. The project was open to students from the Bauhaus-Universität Weimar and supervised by Johannes Kiesel from the Webis group.
Beforehand, the five students of the project developed two Amazon Alexa skills: one that tells Chuck Norris jokes and one that uses the sentence completion engine netspeak.org to complete phrases. During the development of these skills, some shortcomings of the Alexa Skills Kit limited the quality of our skills significantly. In particular, the speech-to-text recognition of Amazon Alexa is not good enough for some use cases (including ours).
This is why I decided to explore the possibilities of Google Home instead. My experience with voice assistants so far is that they can answer a surprising number of questions. However, if an assistant does not know the answer to your question, it is not possible to fall back to a normal web search and 'search by voice'. SearchBullet tries to bridge this gap until assistants can reliably answer everything. It lets users search on google.com or on websites of their choice and sends the link to the search results to the user's smartphone as a push notification.
Google calls its skills actions. In contrast to Alexa, these actions are more like a separate assistant that Google Home talks to and less like a skill that Alexa acquires. Google actions reply with a different voice than the default Google Home voice; they start and end with a beep and say 'Hi' and 'bye'. Actions can be built with the Actions SDK or with Dialogflow, a toolkit that supports multiple assistant platforms and was acquired by Google. Whereas the Actions SDK requires coding, Dialogflow makes it possible to develop whole actions without a single line of code. Moreover, the Actions SDK does not provide tools for intent mapping, so it should only be chosen for very simple actions and for actions where the developer wants to do all of the natural language processing on their own.
For the parts of an action that require access to a database, server logic, etc., fulfillments are used, which communicate with your server via webhooks. To program the fulfillment part of this app, the relatively new Jovo framework was used. Jovo already supports a lot of functionality that we had to add manually to our Alexa app: a speech builder, session data, and follow-up states. More importantly, it aims to become a platform-independent assistant framework and allows developing for both Alexa and Google Home.
The following intents were created in Dialogflow:
- Default Fallback Intent
- Default Welcome Intent
- finished
- go-on
- just-search
- result-detail
- search-website
- url-domain
The Default Fallback Intent replies if something went wrong. It is fully configured in Dialogflow and is basically just a list of error messages.
The Default Welcome Intent provides the general **greeting** and signals that the following interaction will happen outside of the normal Google Home interaction (Google requires this). It is triggered by the Dialogflow welcome event.
The finished intent is triggered when the interaction with the action is finished. It is trained with user input such as "I am done", "Quit", etc.
The go-on intent lets the user get further than the first three search results. It is trained with user input such as "go on" or "yes" and requires the Dialogflow context 'goOn', so that it is only triggered after a search has already been performed.
The just-search intent was added so that users can search the whole web with google.com without having to name Google as the website. It is trained with user input such as "search for _____" or "google _____" and uses the entity @sys.any to capture the search query. This is the major advantage of Google Home over Alexa, which does not allow free-form text input. The intent sets the context 'goOn' with a lifespan of 1 and uses a webhook for fulfillment.
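The paging behind go-on can be sketched as follows. This is a minimal illustration rather than the project's actual code: the page size of three comes from the description above, while the helper name `resultWindow` and the 1-based `start` offset (as used by the Google Custom Search JSON API) are assumptions.

```javascript
// Hypothetical helper: compute which results to request for a given page.
// Page 0 is the initial search; every "go on" increments the page by one.
const PAGE_SIZE = 3; // the action reads out three results at a time

function resultWindow(page) {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError('page must be a non-negative integer');
  }
  // Assumption: the search API expects a 1-based start index.
  return { start: page * PAGE_SIZE + 1, num: PAGE_SIZE };
}

console.log(resultWindow(0)); // { start: 1, num: 3 }
console.log(resultWindow(1)); // { start: 4, num: 3 }
```

The current page could be kept in the session data so that each "go on" only has to increment a counter.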
The result-detail intent reads out a longer snippet of the chosen search result and pushes the search result to the smartphone. It accepts the entity @sys.ordinal as input with user input such as: 'the second result'. It uses a webhook for fulfillment.
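Mapping the spoken ordinal to a search result might look like the sketch below. It assumes that the @sys.ordinal parameter reaches the webhook as a number (2 for 'the second result') or a numeric string; the helper name `pickResult` and the shape of the result objects are hypothetical.

```javascript
// Hypothetical helper: pick the search result a user refers to by ordinal.
// Assumption: the ordinal parameter arrives as a number or numeric string.
function pickResult(results, ordinal) {
  const position = Number(ordinal);
  if (!Number.isInteger(position) || position < 1 || position > results.length) {
    return null; // the caller can ask the user to repeat the choice
  }
  return results[position - 1]; // spoken ordinals are 1-based
}

const sampleResults = [
  { title: 'First', link: 'https://example.com/1' },
  { title: 'Second', link: 'https://example.com/2' },
  { title: 'Third', link: 'https://example.com/3' },
];
console.log(pickResult(sampleResults, 2).title); // Second
console.log(pickResult(sampleResults, 5));       // null
```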
The search-website intent was intended to be the default intent; it lets the user specify a search query and a specific website to search on. Like the just-search intent, it uses @sys.any for the query. In addition, it has two entities for the website: @sys.url, a default entity that expects URLs with a domain ending (.com, .de, .co.uk), and @domain, a custom entity containing popular websites. Each @domain entry consists of the URL including its domain ending, with the plain site name added as a synonym. Thus users can say 'search for iPhone repair on apple' or 'search for edward snowden files on new york times'. This solution gave better intent and entity recognition than alternatives such as appending the domain ending within the fulfillment.
The url-domain intent is a follow-up intent that asks the user to specify the website to search on if it was either not said or could not be understood. If the user does not specify the URL or domain after the search-website intent, they are asked again for the @sys.url or @domain. A fulfillment is used to complete the intent.
During development, ngrok was used to tunnel webhook requests to localhost.
The Jovo framework was used with Node.js to set session attributes, map requests to intents, and build the output speech. It simplifies the code enormously.
A few attempts to scrape the web manually were soon abandoned, as the manual work required to support multiple websites would have exceeded the scope of this individual assignment. Unfortunately, many search APIs have issues of their own: the DuckDuckGo API does not allow full web searches, the Yahoo API will be discontinued soon, and Bing either does not allow searching across all websites or simply costs (too much) money. Google Custom Search can be configured to search across the whole web. The node package google-search was used to perform the search queries in a clean, simplified form. Currently, a separate search query is performed for each search, go-on, and result-detail request. If too many requests to Google Search become an issue, the search results could be stored in a short-term database on the node server. However, this would also increase the complexity of the fulfillment code and the load on the server. The search results are returned as JSON and parsed into JavaScript objects. As JSON support in JavaScript and Node.js is very good, the data can be filtered without a lot of extra work.
Pushbullet is one of the most widely used services for receiving smartphone notifications on desktop machines. It supports iOS, Android, Windows, and multiple browsers, offers OAuth authentication, and has a well-documented API with multiple Node.js modules.
The Google Assistant integration is straightforward with Dialogflow. One of a few voices has to be chosen, and the Google Assistant integration is needed to test the speech output, as the Dialogflow test console does not interpret SSML syntax correctly.
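How the fulfillment could decide between the two website entities, and when the follow-up question is needed, is sketched below. The helper name `resolveSite` and the parameter names `url` and `domain` are assumptions for illustration and are not taken from the project code.

```javascript
// Hypothetical helper: decide which site to search from the intent parameters.
// `url` stands for the @sys.url value, `domain` for the custom @domain value.
function resolveSite(params) {
  const raw = (params.url || params.domain || '').trim().toLowerCase();
  if (raw === '') {
    return null; // nothing usable: the follow-up intent has to ask again
  }
  // Strip a scheme, a leading "www." and any path so only the host remains.
  return raw.replace(/^[a-z]+:\/\//, '').replace(/^www\./, '').split('/')[0];
}

console.log(resolveSite({ url: 'https://www.apple.com/support' })); // apple.com
console.log(resolveSite({ domain: 'nytimes.com' }));                // nytimes.com
console.log(resolveSite({}));                                       // null
```

A `null` return value corresponds to the case in which the url-domain follow-up asks the user for the website again.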
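To illustrate what such a search request and the JSON filtering involve, here is a minimal sketch that talks to the Google Custom Search JSON API directly instead of going through the google-search package the project used. The helper names are hypothetical, and `YOUR_API_KEY` and `YOUR_ENGINE_ID` are placeholders for the API key and the search engine ID.

```javascript
// Build a request URL for the Google Custom Search JSON API.
// apiKey and engineId are placeholders for real credentials.
function buildSearchUrl({ apiKey, engineId, query, site, start = 1, num = 3 }) {
  const url = new URL('https://www.googleapis.com/customsearch/v1');
  url.searchParams.set('key', apiKey);
  url.searchParams.set('cx', engineId);
  url.searchParams.set('q', query);
  url.searchParams.set('start', String(start));
  url.searchParams.set('num', String(num));
  if (site) {
    url.searchParams.set('siteSearch', site); // restrict results to one site
  }
  return url.toString();
}

// Reduce a parsed API response to the fields the voice output needs.
function simplifyResults(response) {
  return (response.items || []).map(({ title, link, snippet }) => ({ title, link, snippet }));
}

const exampleUrl = buildSearchUrl({
  apiKey: 'YOUR_API_KEY',
  engineId: 'YOUR_ENGINE_ID',
  query: 'iPhone repair',
  site: 'apple.com',
});
console.log(exampleUrl);

// A hand-written stand-in for a parsed API response.
const exampleResponse = {
  items: [{ title: 'Repair', link: 'https://example.com', snippet: 'How to...', kind: 'ignored' }],
};
console.log(simplifyResults(exampleResponse));
```

Keeping only `title`, `link`, and `snippet` gives the speech output and the push notification everything they need while dropping the rest of the response.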
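Pushing a search result as a link could look roughly like this. The sketch only builds the HTTP request for the Pushbullet pushes endpoint and does not send it; the helper name `buildLinkPush` and the shape of the `result` object are assumptions, and the project itself may have used one of the existing Node.js modules instead.

```javascript
// Build the HTTP request for a Pushbullet "link" push without sending it.
// accessToken is a placeholder for the user's Pushbullet access token.
function buildLinkPush(accessToken, result) {
  return {
    url: 'https://api.pushbullet.com/v2/pushes',
    options: {
      method: 'POST',
      headers: {
        'Access-Token': accessToken,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        type: 'link',
        title: result.title,
        body: result.snippet,
        url: result.link,
      }),
    },
  };
}

const pushRequest = buildLinkPush('YOUR_ACCESS_TOKEN', {
  title: 'iPhone repair',
  snippet: 'Get service for your iPhone.',
  link: 'https://example.com/iphone-repair',
});
console.log(pushRequest.options.body);
```

The returned object can be handed to any HTTP client, for example `fetch(pushRequest.url, pushRequest.options)` on a Node.js version that ships `fetch`.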
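As an illustration of the kind of SSML output involved, the following sketch wraps result titles in a `<speak>` element with short pauses between them. It is not the project's speech builder; the helper names and the 500 ms pause are assumptions.

```javascript
// Escape characters that have a special meaning in SSML/XML.
function escapeSsml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Read out result titles with a short pause after each one.
function resultsToSsml(titles) {
  const parts = titles.map(
    (title, i) => `Result ${i + 1}: ${escapeSsml(title)}.<break time="500ms"/>`
  );
  return `<speak>${parts.join(' ')}</speak>`;
}

console.log(resultsToSsml(['Q&A']));
// <speak>Result 1: Q&amp;A.<break time="500ms"/></speak>
```

Escaping matters here because search result titles frequently contain characters such as `&` that would otherwise make the SSML invalid.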