
Conversation

@iriark01
Contributor

Ready for engineering review.

Please look at the source for commented-out questions.

You can also see a preview (on the VPN): https://os.mbed.com/docs/mbed-os/TensforFlow_edit/mbed-os-pelion/machine-learning-with-tensorflow-and-mbed-os.html

* Git.
**Getting started**
* xxd.<!--unless it exists by default on all Linux computers?-->
Contributor

Well, you are right, it does come with Ubuntu, although some minimalist distros might not include it for some reason. But since the tutorial focuses on Ubuntu, we could leave this out.

Contributor Author

Suggested change
* xxd.<!--unless it exists by default on all Linux computers?-->

<span class="notes">**Note**: On a Mac, you might have to use gmake instead of make.</span>
[Watch here](http://www.youtube.com/watch?v=x5MhydijWmc)
* Python 2.7. We recommend [using pyenv to manage Python versions](https://pypi.org/project/pyenv/).<!--The install page for Mbed CLI now requires 3.7. Why are we asking for 2.7?-->
Contributor

@spartacoos spartacoos Sep 30, 2020

We tried with Python 3.7 but had issues from the TensorFlow side. See point 5 here.

Contributor Author

And that doesn't cause problems with Mbed CLI?

Contributor

Python 3.x please! What are the issues on the TF-side (apart from outdated tutorial)?

Contributor

Yes, just tested again with a Python 3.7 environment and it worked fine.

The code samples audio from the microphone on the K66F. The audio is run through a Fast Fourier transform to create a spectrogram. The spectrogram is then fed into a pre-trained machine learning model. The model uses a [convolutional neural network](https://developers.google.com/machine-learning/practica/image-classification/convolutional-neural-networks) to identify whether the sample represents either the command “yes” or “no”, silence, or an unknown input. We will explore how this works in more detail later in the guide.
<!--The install page for Mbed CLI now requires 3.7. Why are we asking for 2.7?-->
<!--need to explain what `config root .` and `deploy` do and why we need them - neither one is part of a standard workflow where you use Mbed CLI to import an application, so this is a special case-->
Contributor

From what I understand, `mbed config root .` sets the current folder as the root of the Mbed program (so the `mbed` command knows which Mbed OS project the following commands apply to), which then allows us to run `mbed deploy`, which pulls in and installs the libraries needed (in our case listed in tensorflow/tensorflow/lite/micro/tools/make/gen/mbed_cortex-m4/prj/micro_speech/mbed/mbed-os/requirements.txt). I was not aware that this is not a standard workflow; it is the workflow presented in the original guide hosted on developer.arm.com. If you know of a better way of doing this, I'd be glad to explore that workflow and update the guide.
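For readers of this thread, a minimal sketch of the sequence being described, assuming the project directory generated by the TensorFlow Makefile:

```
# Inside the project directory generated by the TensorFlow Makefile
cd tensorflow/tensorflow/lite/micro/tools/make/gen/mbed_cortex-m4/prj/micro_speech/mbed
mbed config root .   # mark this directory as the root of the Mbed program
mbed deploy          # fetch the libraries the project references (such as mbed-os)
```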

Contributor Author

@donatieng can you confirm/deny?

Contributor

See above; I think this is a shortcoming of the Makefile here.


Are we good here?

The code samples audio from the microphone on the K66F. The audio is run through a Fast Fourier transform to create a spectrogram. The spectrogram is then fed into a pre-trained machine learning model. The model uses a [convolutional neural network](https://developers.google.com/machine-learning/practica/image-classification/convolutional-neural-networks) to identify whether the sample represents either the command “yes” or “no”, silence, or an unknown input. We will explore how this works in more detail later in the guide.
<!--The install page for Mbed CLI now requires 3.7. Why are we asking for 2.7?-->
<!--need to explain what `config root .` and `deploy` do and why we need them - neither one is part of a standard workflow where you use Mbed CLI to import an application, so this is a special case-->
<!--and why are we compiling here? We compile again two steps down, with the flash parameter-->
Contributor

You are right. This is the format presented in the original guide referenced above. However, we could take that out. One reason to keep it could be that we first compile to ensure everything is working, and only then compile and flash the device.

Contributor Author

The first compile step now has to come out, since it compiled with an old version of Mbed OS.

<!--and why are we compiling here? We compile again two steps down, with the flash parameter-->
Here are descriptions of some interesting source files:
For some compilers<!--we only support two, and you specifically asked for GNU, so in what case will this happen?-->, you may get a compilation error in `mbed_rtc_time.cpp`. Go to `mbed-os/platform/mbed_rtc_time.h` and comment out line 32 and line 37:
Contributor

That was coming from the TensorFlow instructions at some point while we were developing the demo. However, I can't seem to find this warning anywhere on their repo anymore, other than in the K66F instructions, which I pushed myself. I just tried this workflow again and it works without commenting anything out, so they must have fixed this issue. We can omit this step.

- [command_responder.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/command_responder.cc) is called every time a potential command has been identified.
1. Flash the application to the board:
<!--I compiled two steps ago - that probably needs to be removed-->
Contributor

Yes, we could remove either the previous compile step or this one to be less redundant. I'd say it might be better to keep this one, as it also flashes the board automatically.

```
mbed compile -m K66F -t GCC_ARM --flash
```
1. Deploy the example to your K66F<!--but we flashed already. What are we doing here?-->
Contributor

Again, following the format in the original guide on developer.arm.com. They had two ways of doing things: one where we compile and flash with a single command, `mbed compile -m K66F -t GCC_ARM --flash`, and one where we just compile and then copy the binary over to the board to flash it, as shown below with `cp ./BUILD/K66F/GCC_ARM/mbed.bin /Volumes/K66F/`.
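For reference, the two flows side by side, using the commands quoted in this guide (the /Volumes/K66F mount point will vary between machines):

```
# Option 1: compile and flash in one step
mbed compile -m K66F -t GCC_ARM --flash

# Option 2: compile, then copy the binary to the board's USB mass storage
mbed compile -m K66F -t GCC_ARM
cp ./BUILD/K66F/GCC_ARM/mbed.bin /Volumes/K66F/
```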

- [main.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/main.cc). This file is the entry point for the Mbed program, which runs the machine learning model using TensorFlow Lite for Microcontrollers.
Copy the binary file that we built earlier to the USB storage.
Note: if you have skipped the previous steps<!--which ones?-->, download the [binary file]() to proceed.
Contributor

All the steps up to the compile step.

cp ./BUILD/K66F/GCC_ARM/mbed.bin /Volumes/K66F/
```
Depending on your operating system <!--aren't we making everyone use Linux?-->, the exact copy command and paths may vary.
Contributor

Good catch. I should have removed this from the original guide.

@iriark01
Contributor Author

iriark01 commented Oct 1, 2020

On hold until it's updated to Mbed OS 6

The model was trained on the one-second samples of audio we saw above. In the training data, the word “yes” or “no” is spoken at the start of the sample, and the entire word is contained within that one-second. However, when the application is running, there is no guarantee that a user will begin speaking at the very beginning of our one-second sample. If the user starts saying “yes” at the end of the sample instead of the beginning, the model might not be able to understand the word. This is because the model uses the position of the features within the sample to help predict which word was spoken.
# Deploy the example to your K66F
To solve this problem, our code runs inference as often as it can<!--here you lost me, probably because I don't deal with ML - is this in the training again? where is it getting new samples?-->, depending on the speed of the device, and averages all of the results within a rolling 1000ms window. The code in the file [recognize_commands.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.cc) performs this action. When the average for a given category in a set of predictions goes above the threshold<!--lost me here too - are we talking about training or recognising actual users?-->, as defined in [recognize_commands.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.h), we can assume a valid result.
Contributor

This part doesn't refer to training data, but rather to samples captured while the model is running on the device. Because we don't know at what point in the window a person will say the keyword, we need to run inference many times, effectively taking snapshots of many overlapping windows; we then analyse those windows and take the average result, as that is what is most likely being said. For a deeper explanation, you can search for "Windowing" in this book on TinyML.
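To make the idea concrete, here is an illustrative sketch of averaging scores over a rolling window; it is not the actual recognize_commands.cc implementation, and the names, window length and threshold are assumptions:

```
#include <cstdint>
#include <deque>

// Illustrative sketch only: average per-category scores over results from
// roughly the last second and report a command once the average clears a
// threshold.
constexpr int kCategoryCount = 4;            // silence, unknown, "yes", "no"
constexpr int32_t kAveragingWindowMs = 1000; // rolling window length (assumed)
constexpr float kDetectionThreshold = 0.8f;  // assumed threshold

struct Result {
  int32_t timestamp_ms;
  float scores[kCategoryCount];
};

std::deque<Result> history;

// Called after every inference with the newest scores; returns the index of
// the detected category, or -1 if nothing has cleared the threshold yet.
int ProcessLatestResult(const Result& latest) {
  history.push_back(latest);
  // Drop results that have fallen outside the rolling window.
  while (!history.empty() &&
         latest.timestamp_ms - history.front().timestamp_ms > kAveragingWindowMs) {
    history.pop_front();
  }
  // Average each category's score across the window.
  float average[kCategoryCount] = {};
  for (const Result& r : history) {
    for (int i = 0; i < kCategoryCount; ++i) average[i] += r.scores[i];
  }
  for (int i = 0; i < kCategoryCount; ++i) average[i] /= history.size();
  // Report the highest-scoring category only if it clears the threshold.
  int best = 0;
  for (int i = 1; i < kCategoryCount; ++i) {
    if (average[i] > average[best]) best = i;
  }
  return average[best] >= kDetectionThreshold ? best : -1;
}
```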

Contributor Author

Suggested change
To solve this problem, our code runs inference as often as it can<!--here you lost me, probably because I don't deal with ML - is this in the training again? where is it getting new samples?-->, depending on the speed of the device, and averages all of the results within a rolling 1000ms window. The code in the file [recognize_commands.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.cc) performs this action. When the average for a given category in a set of predictions goes above the threshold<!--lost me here too - are we talking about training or recognising actual users?-->, as defined in [recognize_commands.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.h), we can assume a valid result.
To solve this problem, our code runs inference as often as it can, depending on the speed of the device, and averages all of the results within a rolling 1000ms window. The code in the file [recognize_commands.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.cc) performs this action. When the average for a given category in a set of predictions goes above the threshold, as defined in [recognize_commands.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/recognize_commands.h), we can assume a valid result.

1. Save the file.
- Modify the code and deploy to the device.
1. Copy the `tiny_conv_micro_features_model_data.cc` <!--where is this file? it's not the one I was just editing-->file into the `tensorflow/tensorflow/lite/micro/tools/make/gen/mbed_cortex-m4/prj/micro_speech/mbed/tensorflow/lite/micro/examples/micro_speech/micro_features` directory.
Contributor

Apologies. This file was originally called `tiny_conv_micro_features_model_data.cc` when the project was first made, but it is now just `micro_features/model.cc`.
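For context on where that C source comes from: it is presumably generated from the trained .tflite model with xxd, which is why xxd is listed in the prerequisites. A sketch using the file names mentioned in this guide; the resulting array then has to match the name the example code expects:

```
# Convert the trained flatbuffer model into a C array that the example compiles in.
xxd -i tiny_conv.tflite > model.cc
```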

Contributor Author

Suggested change
1. Copy the `tiny_conv_micro_features_model_data.cc` <!--where is this file? it's not the one I was just editing-->file into the `tensorflow/tensorflow/lite/micro/tools/make/gen/mbed_cortex-m4/prj/micro_speech/mbed/tensorflow/lite/micro/examples/micro_speech/micro_features` directory.
1. Copy the `model.cc` file into the `tensorflow/tensorflow/lite/micro/tools/make/gen/mbed_cortex-m4/prj/micro_speech/mbed/tensorflow/lite/micro/examples/micro_speech/micro_features` directory.

```
To save time, we will skip this step and instead download the [tiny_conv.tflite](https://developer.arm.com/-/media/Files/downloads/Machine%20learning%20how-to%20guides/tiny_conv.tflite?revision=495eb362-4325-49b8-b3ba-3141df0c9b95&la=en&hash=0F37BA2C5DE95A1561979CDD18973171767A47C3).
The code uses this array to map the output of the model to the correct value. Because we specified our wanted_words<!--is that underscore correct?--> as “up, down” in the training script, we should update this array to reflect these words in the same order.
Contributor

Yes, the underscore is correct, but it should actually be all caps, `WANTED_WORDS`, as in the Python training script.
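To illustrate the change being discussed, a sketch assuming the labels live in an array like the micro_speech example's kCategoryLabels (the exact file and symbol names may differ):

```
// Assumed to mirror the labels table in the micro_speech example; exact
// file and symbol names may differ.
constexpr int kCategoryCount = 4;

// Order matters: the first two entries are always "silence" and "unknown",
// followed by the trained keywords in the same order as WANTED_WORDS ("up,down").
const char* kCategoryLabels[kCategoryCount] = {
    "silence",
    "unknown",
    "up",
    "down",
};
```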

Contributor Author

Suggested change
The code uses this array to map the output of the model to the correct value. Because we specified our wanted_words<!--is that underscore correct?--> as “up, down” in the training script, we should update this array to reflect these words in the same order.
The code uses this array to map the output of the model to the correct value. Because we specified our WANTED_WORDS as “up, down” in the training script, we should update this array to reflect these words in the same order.

```
and open the file:
1. Copy the binary to the USB storage of the device<!--why aren't we using flash?-->, using the same method that you used earlier.
Contributor

I guess we can leave this out and show it as before. I'm not sure what the logic was behind showing two ways of doing it when the original guide was made.

Contributor Author

Suggested change
1. Copy the binary to the USB storage of the device<!--why aren't we using flash?-->, using the same method that you used earlier.

Contributor

@spartacoos spartacoos left a comment

Let me know if I missed any of your questions, or if you need me to look at some of them in more detail, such as why we are flashing in two different ways. I can discuss with Alessandro Grande from Arm, who made the original guide with Pete Warden from Google.

Contributor

@spartacoos spartacoos left a comment

I think this covers your comments plus the porting to Mbed OS 6.3. Feel free to ping me on Slack if anything else needs changing.

@spartacoos spartacoos closed this Oct 13, 2020
@iriark01 iriark01 reopened this Oct 13, 2020
Contributor Author

@iriark01 iriark01 left a comment

Need to approve some changes I've made, and I would like @donatieng to weigh in on some things.


Contributor

@donatieng donatieng left a comment

Thanks @COTASPAR - some feedback below.


cd tensorflow/tensorflow/lite/micro/mbed/
```
1. Change the DebugLog function to use UnbufferedSerial instead since the Serial API was deprecated in Mbed OS 6:
Contributor

Can we upstream this? Or link an open PR in the doc?

Contributor

Many of the other TFLu examples still use older versions of Mbed OS, so upstreaming this could break those examples. Ideally, all of the examples would be ported to the latest version of Mbed OS. However, I think this should be okay for now, while we try to engage with the TFLu team to port the other examples.


As soon as this is published we will start to work with the TF team to discuss next steps.
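For readers following this thread, a sketch of what the described change might look like in the example's DebugLog implementation; the pin names and baud rate are assumptions, so check the file in the generated project:

```
#include <cstring>

#include "mbed.h"

// Sketch only: route the example's debug output through UnbufferedSerial,
// which replaces the Serial API removed in Mbed OS 6.
extern "C" void DebugLog(const char* s) {
  static mbed::UnbufferedSerial serial_out(USBTX, USBRX, 9600);
  serial_out.write(s, std::strlen(s));
}
```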

# Download and build the sample application
```
mbed config root .
Contributor

Probably a shortcoming of the TFLiteU Makefile above; ideally this wouldn't be needed, so it's a candidate for upstreaming too.

Contributor

@spartacoos spartacoos left a comment

Implemented the changes from the comments.

Contributor

@donatieng donatieng left a comment

Did a bit of clean-up, but it looks good. Thanks a lot for the work @COTASPAR!

@donatieng donatieng merged commit ccf5a54 into TensorFlow Nov 15, 2020