Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TFLM: Add new Cortex M target for running on a FVP #46830

Merged
merged 7 commits into from
Feb 16, 2021

Conversation

mansnils
Copy link
Contributor

@mansnils mansnils commented Feb 1, 2021

  • Adds new target for running on a fixed virtual platform based on Arm
    Corstone-300 software.
  • Adds test script for running with FVP.
  • Adds new download scripts.
  • Adds new CI script.
  • Adds readme file.

This is fixing: #46829

The CI script takes about 5 minutes to run for me.

* Adds new target for running on a fixed virtual platform based on Arm
  Corstone-300 software.
* Adds test script for running with FVP.
* Adds new download scripts.
* Adds new CI script.
* Adds readme file.
@google-ml-butler google-ml-butler bot added the size:L CL Change Size: Large label Feb 1, 2021
@google-ml-butler
Copy link

Thanks for contributing to TensorFlow Lite Micro.

To keep this process moving along, we'd like to make sure that you have completed the items on this list:

We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.

@google-cla google-cla bot added the cla: yes label Feb 1, 2021
@advaitjain advaitjain self-requested a review February 1, 2021 16:17
@advaitjain advaitjain added comp:micro:arm comp:micro Related to TensorFlow Lite Microcontrollers labels Feb 1, 2021
@gbaned gbaned self-assigned this Feb 1, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Feb 1, 2021
@advaitjain advaitjain added the kokoro:force-run Tests on submitted change label Feb 3, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 3, 2021
Copy link
Member

@advaitjain advaitjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try and give some concrete suggestions for micro_test.h tomorrow.

Also, can you add to the PR description how much additional time you expect this to add to the CI? We already take quite long (20+ minutes) and I at least want to have any additions be documented.

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Feb 3, 2021
@mansnils
Copy link
Contributor Author

mansnils commented Feb 3, 2021

@advaitjain To speed up the time for the CI script we could just test the CMSIS-NN kernels. Then it took 1 minute and 44 seconds

Copy link
Member

@advaitjain advaitjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 mins is a lot to add to each CI run (we are currently at 20mins). Let's not add to test_all.sh in this PR.

Once all the changes are merged, we can figure out what a suitable approach might be. We can also test on the CI servers and see what the time increase is on there.

Running a smaller set of tests, or making it a continuous build instead of at every presubmit might be more reasonable. 2 mins might be ok, but we start to go down a slippery slope that I'd like to avoid.

As another data-point, the xtensa builds also take around 4 mins and we have them as continuous builds (3 times a day) which we have found sufficient.

tensorflow/lite/micro/testing/micro_test.h Outdated Show resolved Hide resolved
tensorflow/lite/micro/cortex_m_corstone_300/README.md Outdated Show resolved Hide resolved
CCFLAGS += -D$(ARM_CPU)$(CMSIS_ARM_FEATURES)

THIRD_PARTY_CC_SRCS += \
$(wildcard $(ETHOS_U_CORE_PLATFORM)/*.c)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it better to be explicit what files we include from core platform?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any changes for this comment. Just flagging in case you missed some changes when adding commits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the changes at lines 135-137

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused. I see

THIRD_PARTY_CC_SRCS += \
  $(wildcard $(ETHOS_U_CORE_PLATFORM)/*.c)

and

THIRD_PARTY_CC_SRCS += \
  $(ETHOS_U_CORE_PLATFORM)/retarget.c \
  $(ETHOS_U_CORE_PLATFORM)/uart.c`

is this intentional?

@mansnils
Copy link
Contributor Author

mansnils commented Feb 5, 2021

I was able to get the time down to 3 minutes and 14 seconds by changing the FVP cpulimit to 1. With only CMSIS-NN kernels it was 74 seconds.
One option is that we only run CMSIS-NN kernels for both this target as well as stm32f4. Bluepill already runs all unit tests and we are primarily interested in the CMSIS-NN kernels.
I counted the stm32f4 CI script to take 149 seconds and with only CMSIS-NN kernels and debug build it took 105 seconds.
That means the total time would increase with only 30 seconds.

Copy link
Member

@advaitjain advaitjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. only minor nits and there's a merge conflict due to #46904

We can get this merged and then look at what is involved in turning on this target on CI separately.

#ifndef TENSORFLOW_LITE_MICRO_TESTING_SYSTEM_SETUP_H_
#define TENSORFLOW_LITE_MICRO_TESTING_SYSTEM_SETUP_H_

namespace system_setup {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we're going with a flat tflite namespace and descriptive function names.

If you'd prefer we can go with tflite::InitializeTarget instead of tflite::Initialize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's some more context on this decision: https://abseil.io/tips/130

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure


MICRO_LITE_BENCHMARKS := $(filter-out tensorflow/lite/micro/benchmarks/Makefile.inc, $(MICRO_LITE_BENCHMARKS))

EXCLUDED_TESTS := \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make a bug for these excluded tests and add a TODO(#) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just adding a TODO then without a reference I assume

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for not being explicit in my earlier comment.

I was referring to making a github issue and linking to that from here, similar to:

// TODO(#46937): Until we resolve the global variable issue with Renode, we
// will be creating a new ErrorReporter object each time. While this is
// inefficient, it still allows us to make progress.
error_reporter_ = new (micro_error_reporter_buffer) MicroErrorReporter();

In this case the github issue should give the error message and exact commands and we can add a note here that we are excluding to get the first integration merged and will come back and fix the issue later.

-I$(CMSIS_PATH)/Device/ARM/$(ARM_CPU)/Include \
-I$(CMSIS_PATH)/CMSIS/Core/Include

MICRO_LITE_BENCHMARKS := $(filter-out tensorflow/lite/micro/benchmarks/Makefile.inc, $(MICRO_LITE_BENCHMARKS))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a bug for the excluded benchmark?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the excluded tests, let's make a github issue and add a TODO(#) comment here.

@advaitjain
Copy link
Member

I was able to get the time down to 3 minutes and 14 seconds by changing the FVP cpulimit to 1. With only CMSIS-NN kernels it was 74 seconds.
One option is that we only run CMSIS-NN kernels for both this target as well as stm32f4. Bluepill already runs all unit tests and we are primarily interested in the CMSIS-NN kernels.
I counted the stm32f4 CI script to take 149 seconds and with only CMSIS-NN kernels and debug build it took 105 seconds.
That means the total time would increase with only 30 seconds.

Good info, thanks!

Filtering for just the CMSIS kernels might be more maintenance overhead than we want to take on. We do have the ability to run more than a single CI job (which will soon become easier), but let's decide what we want to do after this PR is merged.

Copy link
Member

@advaitjain advaitjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the back and forth, I should have been more explicit in my previous review.

asking for a few more clarifications here.

CCFLAGS += -D$(ARM_CPU)$(CMSIS_ARM_FEATURES)

THIRD_PARTY_CC_SRCS += \
$(wildcard $(ETHOS_U_CORE_PLATFORM)/*.c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused. I see

THIRD_PARTY_CC_SRCS += \
  $(wildcard $(ETHOS_U_CORE_PLATFORM)/*.c)

and

THIRD_PARTY_CC_SRCS += \
  $(ETHOS_U_CORE_PLATFORM)/retarget.c \
  $(ETHOS_U_CORE_PLATFORM)/uart.c`

is this intentional?


MICRO_LITE_BENCHMARKS := $(filter-out tensorflow/lite/micro/benchmarks/Makefile.inc, $(MICRO_LITE_BENCHMARKS))

EXCLUDED_TESTS := \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for not being explicit in my earlier comment.

I was referring to making a github issue and linking to that from here, similar to:

// TODO(#46937): Until we resolve the global variable issue with Renode, we
// will be creating a new ErrorReporter object each time. While this is
// inefficient, it still allows us to make progress.
error_reporter_ = new (micro_error_reporter_buffer) MicroErrorReporter();

In this case the github issue should give the error message and exact commands and we can add a note here that we are excluding to get the first integration merged and will come back and fix the issue later.

-I$(CMSIS_PATH)/Device/ARM/$(ARM_CPU)/Include \
-I$(CMSIS_PATH)/CMSIS/Core/Include

MICRO_LITE_BENCHMARKS := $(filter-out tensorflow/lite/micro/benchmarks/Makefile.inc, $(MICRO_LITE_BENCHMARKS))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the excluded tests, let's make a github issue and add a TODO(#) comment here.

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 10, 2021
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Feb 10, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 10, 2021
@advaitjain
Copy link
Member

didn't see earlier that the bazel build is failing (as part of the TfLite Micro CI).

You'll have to add a new target for system_setup, similar to:

cc_library(
name = "debug_log",
srcs = [
"debug_log.cc",
],
hdrs = [
"debug_log.h",
],
copts = micro_copts(),
)

and add that as a dep to the tests. There's a chance that all the tests will need this additional dependency, which would be very unfortunate. I'll take a look and get back to you.

advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 10, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

Progress towards tensorflow#46829
@advaitjain
Copy link
Member

I have created #47077 to add InitializeTarget as a separate change (and deal with the bazel dependency issue separate from adding the Cornstone300 target)

advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 11, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

Progress towards tensorflow#46829
advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 11, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

In the past, we have added state to the DebugLog() and
GetCurrentTimeTicks() functions as a way to avoid having an
InitializeTarget function. With this change, we are deciding to go with
an explicit intitialization step instead.

This change has added calls to tflite::InitializeTarget to the tests,
benchmarks, and examples and converted the Arduino and SparkfunEdge to
make use of this explicit initialization.

The changes for the Arduino and SparkfunEdge have not been tested on
actual hardware.

Progress towards tensorflow#46829
advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 11, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

In the past, we have added state to the DebugLog() and
GetCurrentTimeTicks() functions as a way to avoid having an
InitializeTarget function. With this change, we are deciding to go with
an explicit intitialization step instead.

This change has added calls to tflite::InitializeTarget to the tests,
benchmarks, and examples and converted the Arduino and SparkfunEdge to
make use of this explicit initialization.

The changes for the Arduino and SparkfunEdge have not been tested on
actual hardware.

Progress towards tensorflow#46829
advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 11, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

In the past, we have added state to the DebugLog() and
GetCurrentTimeTicks() functions as a way to avoid having an
InitializeTarget function. With this change, we are deciding to go with
an explicit intitialization step instead.

This change has added calls to tflite::InitializeTarget to the tests,
benchmarks, and examples and converted the Arduino and SparkfunEdge to
make use of this explicit initialization.

The changes for the Arduino and SparkfunEdge have not been tested on
actual hardware.

Progress towards tensorflow#46829
advaitjain added a commit to advaitjain/tensorflow that referenced this pull request Feb 12, 2021
…target.

This will allow the unit tests to be run on additional targets that need
some addiitonal initialization (for example cornstone_300 from tensorflow#46830).

This particular change is broken out from the Cornstone PR tensorflow#46830 to
be able to have smaller more reviewable PRs.

In the past, we have added state to the DebugLog() and
GetCurrentTimeTicks() functions as a way to avoid having an
InitializeTarget function. With this change, we are deciding to go with
an explicit intitialization step instead.

This change has added calls to tflite::InitializeTarget to the tests,
benchmarks, and examples and converted the Arduino and SparkfunEdge to
make use of this explicit initialization.

The changes for the Arduino and SparkfunEdge have not been tested on
actual hardware.

Progress towards tensorflow#46829
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Feb 12, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 12, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 12, 2021
@google-ml-butler google-ml-butler bot added the kokoro:force-run Tests on submitted change label Feb 15, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 15, 2021
@copybara-service copybara-service bot merged commit 614fdcc into tensorflow:master Feb 16, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Feb 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes comp:micro Related to TensorFlow Lite Microcontrollers ready to pull PR ready for merge process size:L CL Change Size: Large
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

5 participants