
Audio-Visual-Cooking-Assistant

Table of Contents
  1. About The Project
  2. Getting Started
  3. Development
  4. Production
  5. Dataset for Spoken Language Understanding
  6. Recommendations
  7. License
  8. Citation

About The Project

This prototype visualizes an example of a fully implemented interface within a multi-function food processor. It is based on the user flow shown in the following figure.

User flow

Although our prototype does not cover all functions that a real multi-function food processor can offer, it provides a generic solution for researching multimodal interaction through voice and touch.

The prototype is adaptable to different insights and recommendations, allowing for further research. The software architecture is shown in the following figure.

Architecture

This prototype is meant to run locally or on a server such as AWS. To run it locally, please refer to the section Development.

Disclaimer: If you use the code or dataset, please cite our work:

VoiceCookingAssistant. 2021. Audio-Visual-Cooking-Assistant. https://github.com/VoiceCookingAssistant/Audio-Visual-Cooking-Assistant

Built With

The frontend application is built with Svelte; a Node server acts as middleware between the frontend and the Rhasspy instance.
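
To make the data flow concrete, here is a minimal sketch of that middleware pattern. It is illustrative, not the repository's actual server code: it assumes the mqtt and ws npm packages and relies on Rhasspy publishing recognized intents on the Hermes MQTT topics (hermes/intent/<intentName>).

    // middleware-sketch.js -- illustrative only, not the repository's server code.
    // Assumes the `mqtt` and `ws` npm packages. Rhasspy publishes each recognized
    // intent as JSON on the Hermes MQTT topic `hermes/intent/<intentName>`.
    const mqtt = require('mqtt');
    const { WebSocketServer } = require('ws');

    const mqttUrl = `mqtt://${process.env.MQTTHOST || 'localhost'}:${process.env.RHASSPY_PORT || 12183}`;

    const wss = new WebSocketServer({ port: 3000 }); // the frontend connects here
    const client = mqtt.connect(mqttUrl);            // Rhasspy's MQTT broker

    client.on('connect', () => client.subscribe('hermes/intent/#'));

    // Forward every recognized intent to all connected frontend clients.
    client.on('message', (topic, payload) => {
      for (const ws of wss.clients) {
        if (ws.readyState === ws.OPEN) ws.send(payload.toString());
      }
    });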

Getting Started

Prerequisites

You need to have Node.js and npm installed. To run the prototype on your machine, we recommend also installing Docker and docker-compose.


Development

1. Start the Application:

Install the dependencies...

#from root
cd frontend
npm install

#from root
cd server
npm install

Start with docker (Recommended):

docker-compose build
docker-compose up

Navigate to localhost:5000. You should see your app running. Edit a component file in src, save it, and reload the page to see your changes.


Start without docker (Not recommended):

#from root
cd frontend
npm run dev

#from root
cd server
npm run start-dev

Navigate to localhost:5000. You should see your app running. Edit a component file in src, save it, and reload the page to see your changes. 
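
For orientation, a Svelte component in src looks roughly like the following. This is a hypothetical example, not a file from this repo:

    <!-- HelloStep.svelte -- hypothetical example component -->
    <script>
      // `step` would be passed in by a parent component.
      export let step = 1;
    </script>

    <h2>Step {step}</h2>

    <style>
      h2 { color: darkorange; }
    </style>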

2. Connect the Application with a local Rhasspy Environment

To run a local Rhasspy environment you need to have docker and docker-compose installed.

cd rhasspy
docker-compose up

Navigate to localhost:12101/. You should see the environment running.

The first time, you have to adjust the Rhasspy settings in the web UI:

  1. Click the Home Button
  2. Go to Advanced
  3. Paste the contents of the file rhasspy/profile.json from this repo into the profile text field and click "Save Profile".
  4. Click on the "Sentences-Menu-Icon" in the left Menu Bar
  5. Paste the contents of the file rhasspy/template.ini from this repo into the sentences text field and click "Save Sentences" (an illustrative excerpt of the format is shown after this list)

    The dataset in the file rhasspy/template.ini is provided under a CC BY 4.0 license.

  6. Click "Okay" in the Retrain Rhasspy Alert
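
The sentence templates in rhasspy/template.ini use Rhasspy's sentences.ini format: an [IntentName] header followed by one template per line, where (a | b) marks alternatives and [word] marks an optional word. An illustrative excerpt (the intent name and phrases below are hypothetical, not taken from the file):

    [NextStep]
    next step
    go [on] to the next (step | instruction)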

Now you can test the prototype. For further adaptation of the Rhasspy environment, please refer to the official documentation of Rhasspy.
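
As a quick smoke test, you can send a text query to Rhasspy's HTTP API and inspect the recognized intent. A minimal sketch, assuming Node 18+ for the built-in fetch and a query covered by template.ini; save as test-intent.mjs and run with node test-intent.mjs:

    // test-intent.mjs -- smoke test against Rhasspy's HTTP API (Node 18+).
    // POSTs a text query and prints the recognized intent as JSON.
    const res = await fetch('http://localhost:12101/api/text-to-intent', {
      method: 'POST',
      body: 'next step', // replace with a query your template.ini covers
    });
    console.log(await res.json());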

Production

To create an optimised version of the app:

    docker-compose -f docker-compose.yml build
    docker-compose -f docker-compose.yml up

This version expects a .env file in the root directory with the following content:

    #.env
    PORT=3000
    HOST=0.0.0.0
    MQTTHOST=<YOUR_RHASSPY_HOST_IP>
    RHASSPY_PORT=12183

Navigate to localhost:5000. You should see your app running.
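
For orientation, a server entry point could pick these variables up as sketched below. This assumes the dotenv npm package and is illustrative, not the repository's actual code:

    // illustrative server entry sketch -- assumes the `dotenv` npm package.
    require('dotenv').config(); // loads .env from the project root

    const host = process.env.HOST; // e.g. 0.0.0.0
    const port = process.env.PORT; // e.g. 3000
    const mqttUrl = `mqtt://${process.env.MQTTHOST}:${process.env.RHASSPY_PORT}`;

    console.log(`serving on ${host}:${port}, Rhasspy MQTT at ${mqttUrl}`);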

Dataset for Spoken Language Understanding

You can use our large in-domain datasets for your spoken language understanding research.

  • Training Dataset (1964 queries with 10724 running words): rhasspy/NLU/trainset.md
  • Test Dataset (839 queries with 4507 running words): rhasspy/NLU/testset.md

The Training Dataset and Test Dataset are provided under a CC BY 4.0 license.
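
Here, "running words" means the total number of word tokens across all queries. The counts could be reproduced roughly as follows, assuming one query per non-empty line in the dataset files (the file layout is an assumption; adjust the parsing to the actual format):

    // count-queries.js -- illustrative; assumes one query per non-empty line.
    const fs = require('fs');

    const lines = fs.readFileSync('rhasspy/NLU/trainset.md', 'utf8')
      .split('\n')
      .map((l) => l.trim())
      .filter((l) => l.length > 0);

    const words = lines.reduce((n, l) => n + l.split(/\s+/).length, 0);
    console.log(`${lines.length} queries, ${words} running words`);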

Recommendations

If you're using Visual Studio Code, we recommend installing the official extension Svelte for VS Code. If you are using other editors, you may need to install a plugin to get syntax highlighting and IntelliSense.

This prototype was tested in the Google Chrome browser.

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Citation

If you use or build on our work, please cite our paper related to this project:

@inproceedings{kendrick-etal-2021-audio,
    title = "Audio-Visual Recipe Guidance for Smart Kitchen Devices",
    author = "Kendrick, Caroline  and
      Frohnmaier, Mariano  and
      Georges, Munir",
    booktitle = "Proceedings of The Fourth International Conference on Natural Language and Speech Processing (ICNLSP 2021)",
    month = "12--13 " # nov,
    year = "2021",
    address = "Trento, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.icnlsp-1.30",
    pages = "257--261",
}
