Writing intent tests for skills

Åke edited this page Jun 11, 2018 · 6 revisions

The Mycroft skills repo uses an automated testing framework to test the skills. It tests a submitted skill together with the default skills, which should help identify conflicts.

To help the skill team review the submitted skills you can create intent tests. An intent test is a test which takes an utterance as input and checks for an expected output.

For example the utterance "What's the capital of France" should generate the response "Paris".

This helps the skill team verify that the skill works the way the skill author intended.

The intent test file

The above test case can be written in an intent.json test file as

{
  "utterance": "what's the capital of france",
  "expected_response": "paris"
}

"utterance" is the utterance that should be tested and "expected_response" is what mycroft should reply.

This file should be placed in the skill's test/intent folder, and the filename should end with .intent.json to be picked up by the test framework. test/intent/capitalOfFrance.intent.json may be a good name for the test above.
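The naming rule above can be checked locally before submitting a skill. The following is a minimal sketch, not part of the Mycroft test framework: the helper name find_intent_tests is hypothetical, and requiring an "utterance" entry is an assumption based on the example above.

```python
import json
from pathlib import Path


def find_intent_tests(skill_dir):
    """Return (path, test case) pairs for every *.intent.json file
    in the skill's test/intent folder (hypothetical helper)."""
    intent_dir = Path(skill_dir) / "test" / "intent"
    found = []
    for path in sorted(intent_dir.glob("*.intent.json")):
        with open(path, encoding="utf-8") as f:
            case = json.load(f)
        # Assumption: every test case names the utterance to test
        if "utterance" not in case:
            raise ValueError("{} has no 'utterance' entry".format(path.name))
        found.append((path, case))
    return found
```

Files in the folder that do not end with .intent.json are skipped, which mirrors the rule that only such files are picked up.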

The tools

In the mycroft-core repository a couple of test scripts are available in the test/integrationtests/skills/ folder. To run these, make sure the virtual environment for Mycroft is activated. This is done by running source .venv/bin/activate from the mycroft-core folder.

If you have an old installation, the packages listed in test-requirements.txt might not have been installed when you ran dev_setup.sh. To make sure these are installed, run pip install -r test-requirements.txt.

The test scripts are:

discover_tests.py - A pytest script to run all tests in a skills folder. pytest discover_tests.py will find all skill tests in the default skills folder.

single_test.py - Run tests for a single skill. python single_test.py /opt/mycroft/skills/mycroft-datetime.mycroftai will run all tests for the datetime skill.

skill_developers_testrunner.py - A script that can be copied to a skill folder to run any tests for that skill.

cd /opt/mycroft/your_new_skill
python skill_developers_testrunner.py

An example skill

Below is a simple skill to play my favourite c64 remixes from https://remix.kwed.org.

from mycroft import MycroftSkill, intent_handler
from mycroft.skills.audioservice import AudioService
from mycroft.audio import wait_while_speaking
from adapt.intent import IntentBuilder

track_dict = {
    'bomb jack': 'http://remix.kwed.org/files/RKOfiles/Chronblom%20-%20Bomb%20Jack%20subtune%206%20(violin%20version).mp3',
    'druid': 'http://remix.kwed.org/files/RKOfiles/Revel%20Craft%20-%20Druid.mp3',
    'crazy comets':  'http://remix.kwed.org/files/RKOfiles/Makke%20-%20Crazy%20Comets%20(Komet%20Non-Stop).mp3',
    'boulder dash': 'http://remix.kwed.org/files/RKOfiles/Mahoney%20-%20BoulderDash%20(Commodore%2069%20mix).mp3',
    'garfield': 'http://remix.kwed.org/files/RKOfiles/Reyn%20Ouwehand%20-%20Garfield.mp3'
}

class C64RemixSkill(MycroftSkill):
    """ Skill for playing Åke's favorite C64 Remixes. """
    def initialize(self):
        """ Connect to the mycroft audio service. """
        self.audio_service = AudioService(self.emitter)

    @intent_handler(IntentBuilder('').require('Play').require('Best') \
                    .require('Commodore'))
    def handle_best(self, message):
        self.speak_dialog('PlayBest')
        wait_while_speaking()
        self.audio_service.play(track_dict['garfield'])

    @intent_handler(IntentBuilder('').require('Play').require('All') \
                    .require('Commodore'))
    def handle_all(self, message):
        # Make a list of all track urls
        tracks = list(track_dict.values())
        self.speak_dialog('PlayAll')
        wait_while_speaking()
        self.audio_service.play(tracks)


def create_skill():
    return C64RemixSkill()

The entire source with dialog files and vocab files can be found here.

It basically has two commands, "play the best Commodore remix" and "play the top Commodore remixes". The first plays a single track and the second enqueues all five songs and starts playing.

We can write the following two test cases:

test/intent/play.best.intent.json:

{
  "utterance": "play the best commodore remix",
  "expected_dialog": "PlayBest"
}

and

test/intent/play.top.intent.json:

{
  "utterance": "play the top five commodore remixes",
  "expected_dialog": "PlayAll"
}

These are pretty similar to the first basic example, but the "expected_dialog" entry is new. It works like "expected_response", but instead of a sentence it takes the name of a dialog file. The test checks that the sentence spoken by Mycroft is found in the specified dialog file. For PlayAll this will match against the dialog/en-us/PlayAll.dialog file, which contains

Playing the top commodore 64 remixes
Playing my favorite c64 remixes

expected_dialog also allows for parametrized dialogs, as in the datetime skill, where one of the dialogs contains It is {{time}}. In this case the script will accept a spoken reply of "It is " followed by anything.

OK, now that we have our basic tests, let's run them:
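To make the matching rule concrete, here is a minimal sketch of how a dialog line with placeholders could be compared with a spoken reply. How the test framework actually implements this is not shown on this page; the regular expression approach and the function names dialog_line_matches and matches_dialog are assumptions made for illustration.

```python
import re


def dialog_line_matches(dialog_line, spoken):
    """Check a spoken reply against one line of a .dialog file,
    treating every {{placeholder}} as a wildcard (assumed behaviour)."""
    # Split on placeholders, escape the literal parts, join with wildcards
    literal_parts = re.split(r"\{\{\s*\w+\s*\}\}", dialog_line.strip())
    pattern = ".*".join(re.escape(part) for part in literal_parts)
    return re.fullmatch(pattern, spoken.strip(), re.IGNORECASE) is not None


def matches_dialog(dialog_lines, spoken):
    """True if the spoken reply matches any line of the dialog file."""
    return any(dialog_line_matches(line, spoken) for line in dialog_lines)
```

With the two PlayAll lines above, matches_dialog accepts either sentence, and with the line It is {{time}} it accepts any reply that starts with "It is ".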

 ~/projects/python/mycroft-core$ source .venv/bin/activate
(.venv) ~/projects/python/mycroft-core$ cd test/integrationtests/skills
(.venv) ~/projects/python/mycroft-core/test/integrationtests/skills$ python single_test.py /opt/mycroft/skills/best-c64-remixes

This produces a lot of output ending in

----------------------------------------------------------------------
Ran 2 tests in 4.829s

OK

If a test should fail the output would be something along the lines of

----------------------------------------------------------------------
Ran 2 tests in 4.568s

FAILED (failures=1)

With the tests passing for our skill we can run it together with all skills to make sure our skill doesn't interfere. To do this run

(.venv) ~/projects/python/mycroft-core/test/integrationtests/skills$ pytest discover_tests.py /opt/mycroft/skills

and pytest will scan for tests and then start running them:

============================= test session starts ==============================
platform linux -- Python 3.5.2, pytest-3.5.0, py-1.5.3, pluggy-0.6.0
rootdir: /, inifile:
plugins: cov-2.5.1
collected 84 items                                                             

discover_tests.py ...................................................... [ 65%]
...........................                                              [100%]
=================== 82 passed in 352.01 seconds ====================

Mocking your tests

When a skill needs to access an online resource or requires a login, it can be troublesome to get its tests to run consistently. In such cases it can be good to "mock" the calls to the external resource. To mock something means to replace an API call or similar with a faked copy.

Here's a modified version of the skill above using a service:

class BestService():
    track_dict = {
        'bomb jack': 'http://remix.kwed.org/files/RKOfiles/Chronblom%20-%20Bomb%20Jack%20subtune%206%20(violin%20version).mp3',
        'druid': 'http://remix.kwed.org/files/RKOfiles/Revel%20Craft%20-%20Druid.mp3',
        'crazy comets':  'http://remix.kwed.org/files/RKOfiles/Makke%20-%20Crazy%20Comets%20(Komet%20Non-Stop).mp3',
        'boulder dash': 'http://remix.kwed.org/files/RKOfiles/Mahoney%20-%20BoulderDash%20(Commodore%2069%20mix).mp3',
        'garfield': 'http://remix.kwed.org/files/RKOfiles/Reyn%20Ouwehand%20-%20Garfield.mp3'
    }

    def __init__(self):
        self.logged_in = False

    def login(self, username, password):
        """ Log in using username and password. """
        if username == 'hello' and password == 'there':
            self.logged_in = True
        return self.logged_in

    def get_top_five(self):
        """ Get the top five tracks as dictionary. """
        if not self.logged_in:
            raise Exception('Not Logged in!')

        return self.track_dict

    def get_best(self):
        """ Get url of the best track. """
        if not self.logged_in:
            raise Exception('Not Logged in!')

        return self.track_dict['garfield']


class C64RemixSkill(MycroftSkill):
    """ Skill for playing Åke's favorite C64 Remixes. """
    def initialize(self):
        """ Connect to the mycroft audio service. """
        self.audio_service = AudioService(self.emitter)
        self.best = BestService()

    @intent_handler(IntentBuilder('').require('Play').require('Best') \
                    .require('Commodore'))
    def handle_best(self, message):
        # Log in to service if needed
        if not self.best.logged_in:
            self.best.login(self.settings.get('user', ''),
                            self.settings.get('password', ''))

        self.speak_dialog('PlayBest')
        wait_while_speaking()
        self.audio_service.play(self.best.get_best())

    @intent_handler(IntentBuilder('').require('Play').require('All') \
                    .require('Commodore'))
    def handle_all(self, message):
        # Log in to service if needed
        if not self.best.logged_in:
            self.best.login(self.settings.get('user', ''),
                            self.settings.get('password', ''))
        # Make a list of all track urls
        tracks = list(self.best.get_top_five().values())
        self.speak_dialog('PlayAll')
        wait_while_speaking()
        self.audio_service.play(tracks)


def create_skill():
    return C64RemixSkill()

This version of the example skill can be downloaded here

When invoked, the skill tries to log in to the BestService™ using the username and password stored in the skill's settings.json (these could be fetched from home.mycroft.ai, for example). After that it tries to fetch the requested data.

Running single_test.py when the username and password are available in the settings works without problems, but without them the tests fail. To allow a third party to test the skill, the service can be mocked. Python provides unittest.mock, with functions and objects that help with this.

To add mocking to your test, the test framework gives you the option to add a special test_runner. This is a function you provide that can execute code before and after running a test.

To create a test runner, create an __init__.py file in the test/ folder of your skill. The basic content should be

from test.integrationtests.skills.skill_tester import SkillTest

def test_runner(skill, example, emitter, loader):
    return SkillTest(skill, example, emitter).run(loader)

The above will just run the test without any modifications. To make the test work without a password, we can extend it as follows:

from unittest.mock import MagicMock
from test.integrationtests.skills.skill_tester import SkillTest

def test_runner(skill, example, emitter, loader):

    # Get the skill object from the skill path
    s = [s for s in loader.skills if s and s.root_dir == skill]

    # replace the best service with a mock
    s[0].best = MagicMock()
    # Set a valid return value for get_top_five
    s[0].best.get_top_five.return_value = {
        'one': 'http://example.com',
        'two': 'http://example.com',
        'three': 'http://example.com',
        'four': 'http://example.com',
        'five': 'http://example.com'
    }

    # Set a valid return value for get_best
    s[0].best.get_best.return_value = "http://example.com"

    return SkillTest(skill, example, emitter).run(loader)

Now the best service API will be replaced with a mock. The mock is configured to return a dictionary when get_top_five() is called and a URL string when get_best() is called.
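It may help to see how such a mock behaves on its own, outside the test framework. The snippet below is a small stand-alone illustration using only unittest.mock; the variable best stands in for the skill's best attribute. It also shows why the login step in the handlers is skipped when the service is mocked.

```python
from unittest.mock import MagicMock

# Stand-in for the skill's `best` attribute
best = MagicMock()
best.get_best.return_value = "http://example.com"
best.get_top_five.return_value = {"one": "http://example.com"}

# Any attribute that was not configured is itself a MagicMock, which is
# truthy, so the skill's `if not self.best.logged_in` check skips login()
login_skipped = bool(best.logged_in)

url = best.get_best()
tracks = list(best.get_top_five().values())

# The mock also records how it was used
best.get_best.assert_called_once_with()
best.login.assert_not_called()
```

Because the mock never raises the 'Not Logged in!' exception, the handlers run to completion without valid credentials in the settings.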

More detailed tests

The intent tests offer a couple of other capabilities for cases where there is no speech response from Mycroft. The intent.json file can contain instructions to check for certain adapt intents being triggered as well as specific messages on the message bus. The complete list of options can be found here.

If we modify our skill to not give a speech response and only start playing the audio:

    @intent_handler(IntentBuilder('').require('Play').require('Best') \
                    .require('Commodore'))
    def handle_best(self, message):
        # Log in to service if needed
        if not self.best.logged_in:
            self.best.login(self.settings.get('user', ''),
                            self.settings.get('password', ''))

        self.audio_service.play(self.best.get_best())

    @intent_handler(IntentBuilder('').require('Play').require('All') \
                    .require('Commodore'))
    def handle_all(self, message):
        # Log in to service if needed
        if not self.best.logged_in:
            self.best.login(self.settings.get('user', ''),
                            self.settings.get('password', ''))
        # Make a list of all track urls
        tracks = list(self.best.get_top_five().values())
        self.audio_service.play(tracks)

our previous tests will fail. We now have two options: we can check that the Adapt intent gets triggered, or we can check that a message is sent to the audio system to start playing a track.

This is a check that Adapt triggers the correct intent for the utterance "play the top five commodore remixes":

{
  "utterance": "play the top five commodore remixes",
  "intent_type": "handle_all",
  "intent": {
    "All": "top 5",
    "Commodore": "commodore"
  }
}

This will match the intent type "handle_all" and it expects that the keyword All should be identified as "top 5" and the keyword Commodore is identified as "commodore".

The intent type handle_all comes from the handler method:

    @intent_handler(IntentBuilder('').require('Play').require('All') \
                    .require('Commodore'))
    def handle_all(self, message):
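As a rough mental model, the intent check can be thought of as the comparison sketched below. This is not the framework's code: the function name intent_matches and the layout of the message data are assumptions, including the guess that the reported intent name may carry a skill-specific prefix.

```python
def intent_matches(test_case, message_data):
    """Compare the 'intent_type' and 'intent' entries of a test case
    with the data of a handled intent (hypothetical layout)."""
    expected_type = test_case.get("intent_type")
    reported_type = message_data.get("intent_type", "")
    # Assumption: the reported name may be prefixed with a skill identifier
    if expected_type and not reported_type.endswith(expected_type):
        return False
    # Every keyword listed under "intent" must have the expected value
    for keyword, value in test_case.get("intent", {}).items():
        if message_data.get(keyword) != value:
            return False
    return True
```

Keywords that are not listed under "intent", such as Play in the test above, are simply not checked in this sketch.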

To check that the handle_best intent starts playback, an intent test file can be written as:

{
  "utterance": "play the best commodore remix",
  "expected_data": {
    "__type__": "mycroft.audio.service.play"
  }
}
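The idea behind this check can be sketched as follows: the test passes if any message seen on the message bus has the expected type. The sketch is an illustration only and not the framework's implementation; the function names, the (type, data) tuple layout, and the treatment of keys other than "__type__" as fields of the message data are all assumptions.

```python
def message_matches(expected_data, msg_type, msg_data):
    """Compare one bus message with an "expected_data" entry."""
    expected = dict(expected_data)
    # "__type__" is compared with the message type ...
    expected_type = expected.pop("__type__", None)
    if expected_type is not None and msg_type != expected_type:
        return False
    # ... and any remaining keys with the message data (assumption)
    return all(msg_data.get(key) == value for key, value in expected.items())


def any_message_matches(expected_data, messages):
    """True if at least one (type, data) pair satisfies the entry."""
    return any(message_matches(expected_data, msg_type, msg_data)
               for msg_type, msg_data in messages)
```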