diff --git a/community/en/docs/tfmobile/android_build.md b/community/en/docs/tfmobile/android_build.md
new file mode 100644
index 00000000000..cc43dc19708
--- /dev/null
+++ b/community/en/docs/tfmobile/android_build.md
@@ -0,0 +1,196 @@
+# Building TensorFlow on Android
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+TensorFlow Lite is our main mobile and embedded offering. We are working hard
+to close the feature gap between TensorFlow Mobile and TensorFlow Lite. We
+expect to deprecate TensorFlow Mobile in early 2019. We will give ample notice
+to our users when we get to that point and will provide help and support to
+ensure easy migrations.
+
+In the meantime, please use TensorFlow Lite. If you have a feature request,
+such as a missing op, please post to our GitHub.
+
+To get you started working with TensorFlow on Android, we'll walk through two
+ways to build our TensorFlow mobile demos and deploy them on an Android
+device. The first is Android Studio, which lets you build and deploy in an
+IDE. The second is building with Bazel and deploying with ADB on the command
+line.
+
+Why choose one or the other of these methods?
+
+The simplest way to use TensorFlow on Android is to use Android Studio. If you
+aren't planning to customize your TensorFlow build at all, or if you want to use
+Android Studio's editor and other features to build an app and just want to add
+TensorFlow to it, we recommend using Android Studio.
+
+If you are using custom ops, or have some other reason to build TensorFlow from
+scratch, scroll down and see our instructions
+for building the demo with Bazel.
+
+## Build the demo using Android Studio
+
+**Prerequisites**
+
+If you haven't already, do the following two things:
+
+- Install [Android Studio](https://developer.android.com/studio/index.html),
+ following the instructions on their website.
+
+- Clone the TensorFlow repository from GitHub:
+
+ git clone https://github.com/tensorflow/tensorflow
+
+**Building**
+
+1. Open Android Studio, and from the Welcome screen, select **Open an existing
+ Android Studio project**.
+
+2. From the **Open File or Project** window that appears, navigate to and select
+ the `tensorflow/examples/android` directory from wherever you cloned the
+ TensorFlow GitHub repo. Click OK.
+
+ If it asks you to do a Gradle Sync, click OK.
+
+    You may also need to install various platforms and tools, if you get
+    errors like "Failed to find target with hash string 'android-23'" and similar.
+
+3. Open the `build.gradle` file (you can go to **1:Project** in the side panel
+ and find it under the **Gradle Scripts** zippy under **Android**). Look for
+ the `nativeBuildSystem` variable and set it to `none` if it isn't already:
+
+ // set to 'bazel', 'cmake', 'makefile', 'none'
+ def nativeBuildSystem = 'none'
+
+4. Click the *Run* button (the green arrow) or select *Run > Run 'android'* from the
+ top menu. You may need to rebuild the project using *Build > Rebuild Project*.
+
+ If it asks you to use Instant Run, click **Proceed Without Instant Run**.
+
+ Also, you need to have an Android device plugged in with developer options
+ enabled at this
+ point. See [here](https://developer.android.com/studio/run/device.html) for
+ more details on setting up developer devices.
+
+This installs three apps on your phone that are all part of the TensorFlow
+Demo. See [Android Sample Apps](#android_sample_apps) for more information about
+them.
+
+## Adding TensorFlow to your apps using Android Studio
+
+To add TensorFlow to your own apps on Android, the simplest way is to add the
+following lines to your Gradle build file:
+
+ allprojects {
+ repositories {
+ jcenter()
+ }
+ }
+
+ dependencies {
+ implementation 'org.tensorflow:tensorflow-android:+'
+ }
+
+This automatically downloads the latest stable version of TensorFlow as an AAR
+and installs it in your project.
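+
+Once the dependency is in place, the classes under
+`org.tensorflow.contrib.android` (see the [Android Inference
+Library](#android_inference_library) section below) are available to your code.
+Here is a minimal sketch of the load, feed, run, and fetch sequence those
+classes provide; the model file name, node names, class name, and tensor sizes
+below are placeholders you would replace with your own:
+
+    import android.content.res.AssetManager;
+    import org.tensorflow.contrib.android.TensorFlowInferenceInterface;
+
+    public class SimpleClassifier {
+      private final TensorFlowInferenceInterface inferenceInterface;
+
+      public SimpleClassifier(AssetManager assetManager) {
+        // Load a frozen GraphDef bundled in the app's assets.
+        inferenceInterface = new TensorFlowInferenceInterface(
+            assetManager, "file:///android_asset/my_model.pb");
+      }
+
+      public float[] classify(float[] pixels, int inputSize, int numClasses) {
+        float[] outputs = new float[numClasses];
+        // Copy the input in, run the graph, and copy the output tensor back out.
+        inferenceInterface.feed("input", pixels, 1, inputSize, inputSize, 3);
+        inferenceInterface.run(new String[] {"output"});
+        inferenceInterface.fetch("output", outputs);
+        return outputs;
+      }
+    }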
+
+## Build the demo using Bazel
+
+Another way to use TensorFlow on Android is to build an APK
+using Bazel and load it onto your device
+using [ADB](https://developer.android.com/studio/command-line/adb.html). This
+requires some knowledge of build systems and Android developer tools, but we'll
+guide you through the basics here.
+
+- First, follow our instructions for
+ installing from sources.
+ This will also guide you through installing Bazel and cloning the
+ TensorFlow code.
+
+- Download the Android [SDK](https://developer.android.com/studio/index.html)
+ and [NDK](https://developer.android.com/ndk/downloads/index.html) if you do
+ not already have them. You need at least version 12b of the NDK, and 23 of the
+ SDK.
+
+- In your copy of the TensorFlow source, update the
+ [WORKSPACE](https://github.com/tensorflow/tensorflow/blob/master/WORKSPACE)
+  file with the location of your SDK and NDK, where it says `<PATH_TO_NDK>`
+  and `<PATH_TO_SDK>`.
+
+- Run Bazel to build the demo APK:
+
+ bazel build -c opt //tensorflow/examples/android:tensorflow_demo
+
+- Use [ADB](https://developer.android.com/studio/command-line/adb.html#move) to
+ install the APK onto your device:
+
+ adb install -r bazel-bin/tensorflow/examples/android/tensorflow_demo.apk
+
+Note: In general when compiling for Android with Bazel you need
+`--config=android` on the Bazel command line, though in this case this
+particular example is Android-only, so you don't need it here.
+
+This installs three apps on your phone that are all part of the TensorFlow
+Demo. See [Android Sample Apps](#android_sample_apps) for more information about
+them.
+
+## Android Sample Apps
+
+The
+[Android example code](https://www.tensorflow.org/code/tensorflow/examples/android/) is
+a single project that builds and installs three sample apps which all use the
+same underlying code. The sample apps all take video input from a phone's
+camera:
+
+- **TF Classify** uses the Inception v3 model to label the objects it’s pointed
+ at with classes from Imagenet. There are only 1,000 categories in Imagenet,
+ which misses most everyday objects and includes many things you’re unlikely to
+ encounter often in real life, so the results can often be quite amusing. For
+ example there’s no ‘person’ category, so instead it will often guess things it
+ does know that are often associated with pictures of people, like a seat belt
+ or an oxygen mask. If you do want to customize this example to recognize
+ objects you care about, you can use
+ the
+ [TensorFlow for Poets codelab](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) as
+ an example for how to train a model based on your own data.
+
+- **TF Detect** uses a multibox model to try to draw bounding boxes around the
+ locations of people in the camera. These boxes are annotated with the
+ confidence for each detection result. Results will not be perfect, as this
+ kind of object detection is still an active research topic. The demo also
+ includes optical tracking for when objects move between frames, which runs
+ more frequently than the TensorFlow inference. This improves the user
+ experience since the apparent frame rate is faster, but it also gives the
+ ability to estimate which boxes refer to the same object between frames, which
+ is important for counting objects over time.
+
+- **TF Stylize** implements a real-time style transfer algorithm on the camera
+ feed. You can select which styles to use and mix between them using the
+  palette at the bottom of the screen, and also switch the resolution of the
+  processing higher or lower.
+
+When you build and install the demo, you'll see three app icons on your phone,
+one for each of the demos. Tapping on them should open up the app and let you
+explore what they do. You can enable profiling statistics on-screen by tapping
+the volume up button while they’re running.
+
+### Android Inference Library
+
+Because Android apps need to be written in Java, and core TensorFlow is in C++,
+TensorFlow has a JNI library to interface between the two. Its interface is aimed
+only at inference, so it provides the ability to load a graph, set up inputs,
+and run the model to calculate particular outputs. You can see the full
+documentation for the minimal set of methods in
+[TensorFlowInferenceInterface.java](https://www.tensorflow.org/code/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java).
+
+The demo applications use this interface, so they’re a good place to look for
+example usage. You can download prebuilt binary jars
+at
+[ci.tensorflow.org](https://ci.tensorflow.org/view/Nightly/job/nightly-android/).
diff --git a/community/en/docs/tfmobile/index.md b/community/en/docs/tfmobile/index.md
new file mode 100644
index 00000000000..a1d80bfe376
--- /dev/null
+++ b/community/en/docs/tfmobile/index.md
@@ -0,0 +1,299 @@
+# Overview
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+TensorFlow Lite is our main mobile and embedded offering. We are working hard
+to close the feature gap between TensorFlow Mobile and TensorFlow Lite. We
+expect to deprecate TensorFlow Mobile in early 2019. We will give ample notice
+to our users when we get to that point and will provide help and support to
+ensure easy migrations.
+
+In the meantime, please use TensorFlow Lite. If you have a feature request,
+such as a missing op, please post to our GitHub.
+
+TensorFlow was designed to be a good deep learning solution for mobile
+platforms. Currently we have two solutions for deploying machine learning
+applications on mobile and embedded devices: TensorFlow for Mobile and
+TensorFlow Lite.
+
+## TensorFlow Lite versus TensorFlow Mobile
+
+Here are a few of the differences between the two:
+
+- TensorFlow Lite is an evolution of TensorFlow Mobile. In most cases, apps
+ developed with TensorFlow Lite will have a smaller binary size, fewer
+ dependencies, and better performance.
+
+- TensorFlow Lite is in developer preview, so not all use cases are covered yet.
+ We expect you to use TensorFlow Mobile to cover production cases.
+
+- TensorFlow Lite supports only a limited set of operators, so not all models
+ will work on it by default. TensorFlow for Mobile has a fuller set of
+ supported functionality.
+
+TensorFlow Lite provides better performance and a small binary size on mobile
+platforms, as well as the ability to leverage hardware acceleration if it is
+available on those platforms. In addition, it has far fewer dependencies, so it
+can be built and run in simpler, more constrained device scenarios. TensorFlow Lite
+also allows targeting accelerators through the
+[Neural Networks API](https://developer.android.com/ndk/guides/neuralnetworks/index.html).
+
+TensorFlow Lite currently has coverage for a limited set of operators. While
+TensorFlow for Mobile also supports only a constrained set of ops by default,
+it can in principle be customized to build the kernel for any operator you use
+in TensorFlow. Thus use cases which are not currently supported by
+TensorFlow Lite should continue to use TensorFlow for Mobile. As TensorFlow Lite
+evolves, it will gain additional operators, and the decision will become easier
+to make.
+
+
+## Introduction to TensorFlow Mobile
+
+TensorFlow was designed from the ground up to be a good deep learning solution
+for mobile platforms like Android and iOS. This mobile guide should help you
+understand how machine learning can work on mobile platforms and how to
+integrate TensorFlow into your mobile apps effectively and efficiently.
+
+## About this Guide
+
+This guide is aimed at developers who have a TensorFlow model that’s
+successfully working in a desktop environment, who want to integrate it into
+a mobile application, and cannot use TensorFlow Lite. Here are the
+main challenges you’ll face during that process:
+
+- Understanding how to use TensorFlow for mobile.
+- Building TensorFlow for your platform.
+- Integrating the TensorFlow library into your application.
+- Preparing your model file for mobile deployment.
+- Optimizing for latency, RAM usage, model file size, and binary size.
+
+## Common use cases for mobile machine learning
+
+**Why run TensorFlow on mobile?**
+
+Traditionally, deep learning has been associated with data centers and giant
+clusters of high-powered GPU machines. However, it can be very expensive and
+time-consuming to send all of the data a device has access to across a network
+connection. Running on mobile makes it possible to deliver very interactive
+applications in a way that’s not possible when you have to wait for a network
+round trip.
+
+Here are some common use cases for on-device deep learning:
+
+### Speech Recognition
+
+There are a lot of interesting applications that can be built with a
+speech-driven interface, and many of these require on-device processing. Most of
+the time a user isn’t giving commands, and so streaming audio continuously to a
+remote server would be a waste of bandwidth, since it would mostly be silence or
+background noises. To solve this problem it’s common to have a small neural
+network running on-device
+[listening out for a particular keyword](https://www.tensorflow.org/tutorials/sequences/audio_recognition).
+Once that keyword has been spotted, the rest of the
+conversation can be transmitted over to the server for further processing if
+more computing power is needed.
+
+### Image Recognition
+
+It can be very useful for a mobile app to be able to make sense of a camera
+image. If your users are taking photos, recognizing what’s in them can help your
+camera apps apply appropriate filters, or label the photos so they’re easily
+findable. It’s important for embedded applications too, since you can use image
+sensors to detect all sorts of interesting conditions, whether it’s spotting
+endangered animals in the wild
+or
+[reporting how late your train is running](https://svds.com/tensorflow-image-recognition-raspberry-pi/).
+
+TensorFlow comes with several examples of recognizing the types of objects
+inside images along with a variety of different pre-trained models, and they can
+all be run on mobile devices. You can try out
+our
+[Tensorflow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) and
+[Tensorflow for Poets 2: Optimize for Mobile](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/index.html#0) codelabs to
+see how to take a pretrained model and run some very fast and lightweight
+training to teach it to recognize specific objects, and then optimize it to
+run on mobile.
+
+### Object Localization
+
+Sometimes it’s important to know where objects are in an image as well as what
+they are. There are lots of augmented reality use cases that could benefit a
+mobile app, such as guiding users to the right component when offering them
+help fixing their wireless network or providing informative overlays on top of
+landscape features. Embedded applications often need to count objects that are
+passing by them, whether it’s pests in a field of crops, or people, cars and
+bikes going past a street lamp.
+
+TensorFlow offers a pretrained model for drawing bounding boxes around people
+detected in images, together with tracking code to follow them over time. The
+tracking is especially important for applications where you’re trying to count
+how many objects are present over time, since it gives you a good idea when a
+new object enters or leaves the scene. We have some sample code for this
+available for Android [on
+GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android),
+and also a [more general object detection
+model](https://www.tensorflow.org/code/tensorflow_models/object_detection/README.md)
+available as well.
+
+### Gesture Recognition
+
+It can be useful to be able to control applications with hand or other
+gestures, either recognized from images or through analyzing accelerometer
+sensor data. Creating those models is beyond the scope of this guide, but
+TensorFlow is an effective way of deploying them.
+
+### Optical Character Recognition
+
+Google Translate’s live camera view is a great example of how effective
+interactive on-device detection of text can be.
+
+There are multiple steps involved in recognizing text in images. You first have
+to identify the areas where the text is present, which is a variation on the
+object localization problem, and can be solved with similar techniques. Once you
+have an area of text, you then need to interpret it as letters, and then use a
+language model to help guess what words they represent. The simplest way to
+estimate what letters are present is to segment the line of text into individual
+letters, and then apply a simple neural network to the bounding box of each. You
+can get good results with the kind of models used for MNIST, which you can find
+in TensorFlow’s tutorials, though you may want a higher-resolution input. A
+more advanced alternative is to use an LSTM model to process a whole line of
+text at once, with the model itself handling the segmentation into different
+characters.
+
+### Translation
+
+Translating from one language to another quickly and accurately, even if you
+don’t have a network connection, is an important use case. Deep networks are
+very effective at this sort of task, and you can find descriptions of a lot of
+different models in the literature. Often these are sequence-to-sequence
+recurrent models where you’re able to run a single graph to do the whole
+translation, without needing to run separate parsing stages.
+
+### Text Classification
+
+If you want to suggest relevant prompts to users based on what they’re typing or
+reading, it can be very useful to understand the meaning of the text. This is
+where text classification comes in. Text classification is an umbrella term
+that covers everything from sentiment analysis to topic discovery. You’re likely
+to have your own categories or labels that you want to apply, so the best place
+to start is with an example
+like
+[Skip-Thoughts](https://www.tensorflow.org/code/tensorflow_models/skip_thoughts/),
+and then train on your own examples.
+
+### Voice Synthesis
+
+A synthesized voice can be a great way of giving users feedback or aiding
+accessibility, and recent advances such as
+[WaveNet](https://deepmind.com/blog/wavenet-generative-model-raw-audio/) show
+that deep learning can offer very natural-sounding speech.
+
+## Mobile machine learning and the cloud
+
+These examples of use cases give an idea of how on-device networks can
+complement cloud services. Cloud has a great deal of computing power in a
+controlled environment, but running on devices can offer higher interactivity.
+In situations where the cloud is unavailable, or your cloud capacity is limited,
+you can provide an offline experience, or reduce cloud workload by processing
+easy cases on device.
+
+Doing on-device computation can also signal when it's time to switch to working
+on the cloud. A good example of this is hotword detection in speech. Since
+devices are able to constantly listen out for the keywords, the heavier
+cloud-based speech recognition only needs to be triggered once a keyword is
+recognized. Without
+the on-device component, the whole application wouldn’t be feasible, and this
+pattern exists across several other applications as well. Recognizing that some
+sensor input is interesting enough for further processing makes a lot of
+interesting products possible.
+
+## What hardware and software should you have?
+
+TensorFlow runs on Ubuntu Linux, Windows 10, and OS X. For a list of all
+supported operating systems and instructions to install TensorFlow, see
+Installing TensorFlow.
+
+Note that some of the sample code we provide for mobile TensorFlow requires you
+to compile TensorFlow from source, so you’ll need more than just `pip install`
+to work through all the sample code.
+
+To try out the mobile examples, you’ll need a device set up for development,
+using
+either [Android Studio](https://developer.android.com/studio/install.html),
+or [Xcode](https://developer.apple.com/xcode/) if you're developing for iOS.
+
+## What should you do before you get started?
+
+Before thinking about how to get your solution on mobile:
+
+1. Determine whether your problem is solvable by mobile machine learning.
+2. Create a labelled dataset to define your problem.
+3. Pick an effective model for the problem.
+
+We'll discuss these in more detail below.
+
+### Is your problem solvable by mobile machine learning?
+
+Once you have an idea of the problem you want to solve, you need to make a plan
+of how to build your solution. The most important first step is making sure that
+your problem is actually solvable, and the best way to do that is to mock it up
+using humans in the loop.
+
+For example, if you want to drive a robot toy car using voice commands, try
+recording some audio from the device and listen back to it to see if you can
+make sense of what’s being said. Often you’ll find there are problems in the
+capture process, such as the motor drowning out speech or not being able to hear
+at a distance, and you should tackle these problems before investing in the
+modeling process.
+
+Another example would be giving photos taken from your app to people to see if
+they can classify what’s in them in the way you’re looking for. If they can’t do
+that (for example, trying to estimate calories in food from photos may be
+impossible because all white soups look the same), then you’ll need to redesign
+your experience to cope with that. A good rule of thumb is that if a human can’t
+handle the task then it will be difficult to train a computer to do better.
+
+### Create a labelled dataset
+
+After you’ve solved any fundamental issues with your use case, you need to
+create a labeled dataset to define what problem you’re trying to solve. This
+step is extremely important, even more so than picking which model to use. You want it
+to be as representative as possible of your actual use case, since the model
+will only be effective at the task you teach it. It’s also worth investing in
+tools to make labeling the data as efficient and accurate as possible. For
+example, if you’re able to switch from having to click a button on a web
+interface to simple keyboard shortcuts, you may be able to speed up the
+generation process a lot. You should also start by doing the initial labeling
+yourself, so you can learn about the difficulties and likely errors, and
+possibly change your labeling or data capture process to avoid them. Once you
+and your team are able to consistently label examples (that is, once you
+generally agree on the same labels for most examples), you can then try to
+capture your knowledge in a manual and teach external raters how to run the same
+process.
+
+### Pick an effective model
+
+The next step is to pick an effective model to use. You might be able to avoid
+training a model from scratch if someone else has already implemented a model
+similar to what you need; we have a repository of models implemented in
+TensorFlow [on GitHub](https://github.com/tensorflow/models) that you can look
+through. Lean towards the simplest model you can find, and try to get started as
+soon as you have even a small amount of labelled data, since you’ll get the best
+results when you’re able to iterate quickly. The shorter the time it takes to
+try training a model and running it in its real application, the better overall
+results you’ll see. It’s common for an algorithm to get great training accuracy
+numbers but then fail to be useful within a real application because there’s a
+mismatch between the dataset and real usage. Prototype end-to-end usage as soon
+as possible to create a consistent user experience.
diff --git a/community/en/docs/tfmobile/ios_build.md b/community/en/docs/tfmobile/ios_build.md
new file mode 100644
index 00000000000..080ba660214
--- /dev/null
+++ b/community/en/docs/tfmobile/ios_build.md
@@ -0,0 +1,125 @@
+# Building TensorFlow on iOS
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+TensorFlow Lite is our main mobile and embedded offering. We are working hard
+to close the feature gap between TensorFlow Mobile and TensorFlow Lite. We
+expect to deprecate TensorFlow Mobile in early 2019. We will give ample notice
+to our users when we get to that point and will provide help and support to
+ensure easy migrations.
+
+In the meantime, please use TensorFlow Lite. If you have a feature request,
+such as a missing op, please post to our GitHub.
+
+## Using CocoaPods
+
+The simplest way to get started with TensorFlow on iOS is using the CocoaPods
+package management system. You can add the `TensorFlow-experimental` pod to your
+Podfile, which installs a universal binary framework. This makes it easy to get
+started but has the disadvantage of being hard to customize, which is important
+in case you want to shrink your binary size. If you do need the ability to
+customize your libraries, see later sections on how to do that.
+
+## Creating your own app
+
+If you'd like to add TensorFlow capabilities to your own app, do the following:
+
+- Create your own app or load your already-created app in Xcode.
+
+- Add a file named Podfile at the project root directory with the following content:
+
+ target 'YourProjectName'
+ pod 'TensorFlow-experimental'
+
+- Run `pod install` to download and install the `TensorFlow-experimental` pod.
+
+- Open `YourProjectName.xcworkspace` and add your code.
+
+- In your app's **Build Settings**, make sure to add `$(inherited)` to the
+  **Other Linker Flags** and **Header Search Paths** sections.
+
+## Running the Samples
+
+You'll need Xcode 7.3 or later to run our iOS samples.
+
+There are currently three examples: simple, benchmark, and camera. For now, you
+can download the sample code by cloning the main tensorflow repository (we are
+planning to make the samples available as a separate repository later).
+
+From the root of the tensorflow folder, download [Inception
+v1](https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip),
+and extract the label and graph files into the data folders inside both the
+simple and camera examples using these steps:
+
+ mkdir -p ~/graphs
+ curl -o ~/graphs/inception5h.zip \
+ https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
+ && unzip ~/graphs/inception5h.zip -d ~/graphs/inception5h
+ cp ~/graphs/inception5h/* tensorflow/examples/ios/benchmark/data/
+ cp ~/graphs/inception5h/* tensorflow/examples/ios/camera/data/
+ cp ~/graphs/inception5h/* tensorflow/examples/ios/simple/data/
+
+Change into one of the sample directories, download the
+[Tensorflow-experimental](https://cocoapods.org/pods/TensorFlow-experimental)
+pod, and open the Xcode workspace. Note that installing the pod can take a long
+time since it is big (~450MB). If you want to run the simple example, then:
+
+ cd tensorflow/examples/ios/simple
+ pod install
+ open tf_simple_example.xcworkspace # note .xcworkspace, not .xcodeproj
+ # this is created by pod install
+
+Run the simple app in the Xcode simulator. You should see a single-screen app
+with a **Run Model** button. Tap that, and you should see some debug output
+appear below indicating that the example Grace Hopper image in the data
+directory has been analyzed, with a military uniform recognized.
+
+Run the other samples using the same process. The camera example requires a real
+device connected. Once you build and run that, you should get a live camera view
+that you can point at objects to get real-time recognition results.
+
+### iOS Example details
+
+There are three demo applications for iOS, all defined in Xcode projects inside
+[tensorflow/examples/ios](https://www.tensorflow.org/code/tensorflow/examples/ios/).
+
+- **Simple**: This is a minimal example showing how to load and run a TensorFlow
+ model in as few lines as possible. It just consists of a single view with a
+  button that executes the model loading and inference when it’s pressed.
+
+- **Camera**: This is very similar to the Android TF Classify demo. It loads
+ Inception v3 and outputs its best label estimate for what’s in the live camera
+ view. As with the Android version, you can train your own custom model using
+ TensorFlow for Poets and drop it into this example with minimal code changes.
+
+- **Benchmark**: This is quite close to Simple, but it runs the graph repeatedly and
+ outputs similar statistics to the benchmark tool on Android.
+
+
+### Troubleshooting
+
+- Make sure you use the TensorFlow-experimental pod (and not TensorFlow).
+
+- The TensorFlow-experimental pod is currently about 450MB. The reason it is so
+  big is because we are bundling multiple platforms, and the pod includes all
+  TensorFlow functionality (e.g. operations). The final app size after build is
+  substantially smaller though (~25MB). Working with the complete pod is
+  convenient during development, but see the section below on how you can build
+  your own custom TensorFlow library to reduce the size.
+
+## Building the TensorFlow iOS libraries from source
+
+While CocoaPods is the quickest and easiest way of getting started, you sometimes
+need more flexibility to determine which parts of TensorFlow your app should be
+shipped with. For such cases, you can build the iOS libraries from
+source. [This
+guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios#building-the-tensorflow-ios-libraries-from-source)
+contains detailed instructions on how to do that.
+
diff --git a/community/en/docs/tfmobile/linking_libs.md b/community/en/docs/tfmobile/linking_libs.md
new file mode 100644
index 00000000000..0f40984245f
--- /dev/null
+++ b/community/en/docs/tfmobile/linking_libs.md
@@ -0,0 +1,271 @@
+# Integrating TensorFlow libraries
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+TensorFlow Lite is our main mobile and embedded offering. We are working hard
+to close the feature gap between TensorFlow Mobile and TensorFlow Lite. We
+expect to deprecate TensorFlow Mobile in early 2019. We will give ample notice
+to our users when we get to that point and will provide help and support to
+ensure easy migrations.
+
+In the meantime, please use TensorFlow Lite. If you have a feature request,
+such as a missing op, please post to our GitHub.
+
+Once you have made some progress on a model that addresses the problem you’re
+trying to solve, it’s important to test it out inside your application
+immediately. There are often unexpected differences between your training data
+and what users actually encounter in the real world, and getting a clear picture
+of the gap as soon as possible improves the product experience.
+
+This page talks about how to integrate the TensorFlow libraries into your own
+mobile applications, once you have already successfully built and deployed the
+TensorFlow mobile demo apps.
+
+## Linking the library
+
+After you've managed to build the examples, you'll probably want to call
+TensorFlow from one of your existing applications. The very easiest way to do
+this is to use the Pod installation steps described in
+Building TensorFlow on iOS, but if you want to build
+TensorFlow from source (for example to customize which operators are included)
+you'll need to break out TensorFlow as a framework, include the right header
+files, and link against the built libraries and dependencies.
+
+### Android
+
+For Android, you just need to link in a Java library contained in a JAR file
+called `libandroid_tensorflow_inference_java.jar`. There are three ways to
+include this functionality in your program:
+
+1. Include the jcenter AAR which contains it, as in this
+ [example app](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/tfmobile/build.gradle#L59-L65)
+
+2. Download the nightly precompiled version from
+[ci.tensorflow.org](http://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/).
+
+3. Build the JAR file yourself using the instructions [in our Android GitHub repo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/android)
+
+### iOS
+
+Pulling in the TensorFlow libraries on iOS is a little more complicated. Here is
+a checklist of what you’ll need to do to your iOS app:
+
+- Link against `tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a`, usually
+ by adding `-L/your/path/tensorflow/contrib/makefile/gen/lib/` and
+ `-ltensorflow-core` to your linker flags.
+
+- Link against the generated protobuf libraries by adding
+ `-L/your/path/tensorflow/contrib/makefile/gen/protobuf_ios/lib` and
+ `-lprotobuf` and `-lprotobuf-lite` to your command line.
+
+- For the include paths, you need the root of your TensorFlow source folder as
+ the first entry, followed by
+ `tensorflow/contrib/makefile/downloads/protobuf/src`,
+ `tensorflow/contrib/makefile/downloads`,
+ `tensorflow/contrib/makefile/downloads/eigen`, and
+ `tensorflow/contrib/makefile/gen/proto`.
+
+- Make sure your binary is built with `-force_load` (or the equivalent on your
+ platform), aimed at the TensorFlow library to ensure that it’s linked
+ correctly. More detail on why this is necessary can be found in the next
+ section, [Global constructor magic](#global_constructor_magic). On Linux-like
+ platforms, you’ll need different flags, more like
+ `-Wl,--allow-multiple-definition -Wl,--whole-archive`.
+
+You’ll also need to link in the Accelerate framework, since this is used to
+speed up some of the operations.
+
+## Global constructor magic
+
+One of the subtlest problems you may run up against is the “No session factory
+registered for the given session options” error when trying to call TensorFlow
+from your own application. To understand why this is happening and how to fix
+it, you need to know a bit about the architecture of TensorFlow.
+
+The framework is designed to be very modular, with a thin core and a large
+number of specific objects that are independent and can be mixed and matched as
+needed. To enable this, the coding pattern in C++ had to let modules easily
+notify the framework about the services they offer, without requiring a central
+list that has to be updated separately from each implementation. It also had to
+allow separate libraries to add their own implementations without needing a
+recompile of the core.
+
+To achieve this capability, TensorFlow uses a registration pattern in a lot of
+places. In the code, it looks like this:
+
+```
+class MulKernel : OpKernel {
+ Status Compute(OpKernelContext* context) { … }
+};
+REGISTER_KERNEL(MulKernel, "Mul");
+```
+
+This would be in a standalone `.cc` file linked into your application, either
+as part of the main set of kernels or as a separate custom library. The magic
+part is that the `REGISTER_KERNEL()` macro is able to inform the core of
+TensorFlow that it has an implementation of the Mul operation, so that it can be
+called in any graphs that require it.
+
+From a programming point of view, this setup is very convenient. The
+implementation and registration code live in the same file, and adding new
+implementations is as simple as compiling and linking it in. The difficult part
+comes from the way that the `REGISTER_KERNEL()` macro is implemented. C++
+doesn’t offer a good mechanism for doing this sort of registration, so we have
+to resort to some tricky code. Under the hood, the macro is implemented so that
+it produces something like this:
+
+```
+class RegisterMul {
+ public:
+ RegisterMul() {
+    global_kernel_registry()->Register("Mul", [](){
+      return new MulKernel();
+ });
+ }
+};
+RegisterMul g_register_mul;
+```
+
+This sets up a class `RegisterMul` with a constructor that tells the global
+kernel registry what function to call when somebody asks it how to create a
+“Mul” kernel. Then there’s a global object of that class, and so the constructor
+should be called at the start of any program.
+
+While this may sound sensible, the unfortunate part is that the global object
+that’s defined is not used by any other code, so linkers not designed with this
+in mind will decide that it can be deleted. As a result, the constructor is
+never called, and the class is never registered. All sorts of modules use this
+pattern in TensorFlow, and it happens that `Session` implementations are the
+first to be looked for when the code is run, which is why it shows up as the
+characteristic error when this problem occurs.
+
+The solution is to force the linker to not strip any code from the library, even
+if it believes it’s unused. On iOS, this step can be accomplished with the
+`-force_load` flag, specifying a library path, and on Linux you need
+`--whole-archive`. These persuade the linker to not be as aggressive about
+stripping, and should retain the globals.
+
+The actual implementation of the various `REGISTER_*` macros is a bit more
+complicated in practice, but they all suffer the same underlying problem. If
+you’re interested in how they work, [op_kernel.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_kernel.h#L1091)
+is a good place to start investigating.
+
+## Protobuf problems
+
+TensorFlow relies on
+the [Protocol Buffer](https://developers.google.com/protocol-buffers/) library,
+commonly known as protobuf. This library takes definitions of data structures
+and produces serialization and access code for them in a variety of
+languages. The tricky part is that this generated code needs to be linked
+against shared libraries for the exact same version of the framework that was
+used for the generator. This can be an issue when `protoc`, the tool used to
+generate the code, is from a different version of protobuf than the libraries in
+the standard linking and include paths. For example, you might be using a copy
+of `protoc` that was built locally in `~/projects/protobuf-3.0.1.a`, but you have
+libraries installed at `/usr/local/lib` and `/usr/local/include` that are from
+3.0.0.
+
+The symptoms of this issue are errors during the compilation or linking phases
+with protobufs. Usually, the build tools take care of this, but if you’re using
+the makefile, make sure you’re building the protobuf library locally and using
+it, as shown in [this Makefile](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/Makefile#L18).
+
+Another situation that can cause problems is when protobuf headers and source
+files need to be generated as part of the build process. This process makes
+building more complex, since the first phase has to be a pass over the protobuf
+definitions to create all the needed code files, and only after that can you go
+ahead and do a build of the library code.
+
+### Multiple versions of protobufs in the same app
+
+Protobufs generate headers that are needed as part of the C++ interface to the
+overall TensorFlow library. This complicates using the library as a standalone
+framework.
+
+If your application is already using version 1 of the protocol buffers library,
+you may have trouble integrating TensorFlow because it requires version 2. If
+you just try to link both versions into the same binary, you’ll see linking
+errors because some of the symbols clash. To solve this particular problem, we
+have an experimental script at [rename_protobuf.sh](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/rename_protobuf.sh).
+
+You need to run this as part of the makefile build, after you’ve downloaded all
+the dependencies:
+
+```
+tensorflow/contrib/makefile/download_dependencies.sh
+tensorflow/contrib/makefile/rename_protobuf.sh
+```
+
+## Calling the TensorFlow API
+
+Once you have the framework available, you then need to call into it. The usual
+pattern is that you first load your model, which represents a preset set of
+numeric computations, and then you run inputs through that model (for example,
+images from a camera) and receive outputs (for example, predicted labels).
+
+On Android, we provide the Java Inference Library that is focused on just this
+use case, while on iOS and Raspberry Pi you call directly into the C++ API.
+
+### Android
+
+Here’s what a typical Inference Library sequence looks like on Android:
+
+```
+// Load the model from disk.
+TensorFlowInferenceInterface inferenceInterface =
+    new TensorFlowInferenceInterface(assetManager, modelFilename);
+
+// Copy the input data into TensorFlow.
+inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);
+
+// Run the inference call.
+inferenceInterface.run(outputNames, logStats);
+
+// Copy the output Tensor back into the output array.
+inferenceInterface.fetch(outputName, outputs);
+```
+
+You can find the source of this code in the [Android examples](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java#L107).
+
+### iOS and Raspberry Pi
+
+Here’s the equivalent code for iOS and Raspberry Pi:
+
+```
+// Load the model.
+PortableReadFileToProto(file_path, &tensorflow_graph);
+
+// Create a session from the model.
+tensorflow::Status s = session->Create(tensorflow_graph);
+if (!s.ok()) {
+ LOG(FATAL) << "Could not create TensorFlow Graph: " << s;
+}
+
+// Run the model.
+std::string input_layer = "input";
+std::string output_layer = "output";
+std::vector<tensorflow::Tensor> outputs;
+tensorflow::Status run_status = session->Run({{input_layer, image_tensor}},
+                                             {output_layer}, {}, &outputs);
+if (!run_status.ok()) {
+ LOG(FATAL) << "Running model failed: " << run_status;
+}
+
+// Access the output data.
+tensorflow::Tensor* output = &outputs[0];
+```
+
+This is all based on the
+[iOS sample code](https://www.tensorflow.org/code/tensorflow/examples/ios/simple/RunModelViewController.mm),
+but there’s nothing iOS-specific; the same code should be usable on any platform
+that supports C++.
+
+You can also find specific examples for Raspberry Pi
+[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/pi_examples/label_image/label_image.cc).
diff --git a/community/en/docs/tfmobile/optimizing.md b/community/en/docs/tfmobile/optimizing.md
new file mode 100644
index 00000000000..1b3f3e60318
--- /dev/null
+++ b/community/en/docs/tfmobile/optimizing.md
@@ -0,0 +1,519 @@
+# Optimizing for mobile
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+TensorFlow Lite is our main mobile and embedded offering. We are working hard
+to close the feature gap between TensorFlow Mobile and TensorFlow Lite. We
+expect to deprecate TensorFlow Mobile in early 2019. We will give ample notice
+to our users when we get to that point and will provide help and support to
+ensure easy migrations.
+
+In the meantime, please use TensorFlow Lite. If you have a feature request,
+such as a missing op, please post to our GitHub.
+
+There are some special issues that you have to deal with when you’re trying to
+ship on mobile or embedded devices, and you’ll need to think about these as
+you’re developing your model.
+
+These issues are:
+
+- Model and Binary Size
+- App speed and model loading speed
+- Performance and threading
+
+We'll discuss a few of these below.
+
+## What are the minimum device requirements for TensorFlow?
+
+You need at least one megabyte of program memory and several megabytes of RAM to
+run the base TensorFlow runtime, so it’s not suitable for DSPs or
+microcontrollers. Other than those, the biggest constraint is usually the
+calculation speed of the device, and whether you can run the model you need for
+your application with a low enough latency. You can use the benchmarking tools
+in [How to Profile your Model](#how_to_profile_your_model) to get an idea of how
+many FLOPs are required for a model, and then use that to make rule-of-thumb
+estimates of how fast they will run on different devices. For example, a modern
+smartphone might be able to run 10 GFLOPs per second, so the best you could hope
+for from a 5 GFLOP model is two frames per second, though you may do worse
+depending on what the exact computation patterns are.
+
+This model dependence means that it’s possible to run TensorFlow even on very
+old or constrained phones, as long as you optimize your network to fit within
+the latency budget and possibly within limited RAM too. For memory usage, you
+mostly need to make sure that the intermediate buffers that TensorFlow creates
+aren’t too large, which you can examine in the benchmark output too.
+
+## Speed
+
+One of the highest priorities of most model deployments is figuring out how to
+run the inference fast enough to give a good user experience. The first place to
+start is by looking at the total number of floating point operations that are
+required to execute the graph. You can get a very rough estimate of this by
+using the `benchmark_model` tool:
+
+ bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \
+ bazel-bin/tensorflow/tools/benchmark/benchmark_model \
+ --graph=/tmp/inception_graph.pb --input_layer="Mul:0" \
+ --input_layer_shape="1,299,299,3" --input_layer_type="float" \
+ --output_layer="softmax:0" --show_run_order=false --show_time=false \
+ --show_memory=false --show_summary=true --show_flops=true --logtostderr
+
+This should show you an estimate of how many operations are needed to run the
+graph. You can then use that information to figure out how feasible your model
+is to run on the devices you’re targeting. For example, a high-end phone from
+2016 might be able to do 20 billion FLOPs per second, so the best speed you
+could hope for from a model that requires 10 billion FLOPs is around 500ms. On a
+device like the Raspberry Pi 3 that can do about 5 billion FLOPs, you may only
+get one inference every two seconds.
+
+Having this estimate helps you plan for what you’ll be able to realistically
+achieve on a device. If the model is using too many ops, then there are a lot of
+opportunities to optimize the architecture to reduce that number.
+
+Advanced techniques include [SqueezeNet](https://arxiv.org/abs/1602.07360)
+and [MobileNet](https://arxiv.org/abs/1704.04861), which are architectures
+designed to produce models for mobile -- lean and fast but with a small accuracy
+cost. You can also just look at alternative models, even older ones, which may
+be smaller. For example, Inception v1 only has around 7 million parameters,
+compared to Inception v3’s 24 million, and requires only 3 billion FLOPs rather
+than 9 billion for v3.
+
+## Model Size
+
+Models that run on a device need to be stored somewhere on the device, and very
+large neural networks can be hundreds of megabytes. Most users are reluctant to
+download very large app bundles from app stores, so you want to make your model
+as small as possible. Furthermore, smaller neural networks can move in and
+out of a mobile device's memory faster.
+
+To understand how large your network will be on disk, start by looking at the
+size on disk of your `GraphDef` file after you’ve run `freeze_graph` and
+`strip_unused_nodes` on it (see Preparing models for
+more details on these tools), since then it should only contain
+inference-related nodes. To double-check that your results are as expected, run
+the `summarize_graph` tool to see how many parameters are in constants:
+
+ bazel build tensorflow/tools/graph_transforms:summarize_graph && \
+ bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
+ --in_graph=/tmp/tensorflow_inception_graph.pb
+
+That command should give you output that looks something like this:
+
+ No inputs spotted.
+ Found 1 possible outputs: (name=softmax, op=Softmax)
+ Found 23885411 (23.89M) const parameters, 0 (0) variable parameters,
+ and 99 control_edges
+ Op types used: 489 Const, 99 CheckNumerics, 99 Identity, 94
+ BatchNormWithGlobalNormalization, 94 Conv2D, 94 Relu, 11 Concat, 9 AvgPool,
+ 5 MaxPool, 1 Sub, 1 Softmax, 1 ResizeBilinear, 1 Reshape, 1 Mul, 1 MatMul,
+ 1 ExpandDims, 1 DecodeJpeg, 1 Cast, 1 BiasAdd
+
+The important part for our current purposes is the number of const
+parameters. In most models these will be stored as 32-bit floats to start, so if
+you multiply the number of const parameters by four, you should get something
+that’s close to the size of the file on disk. You can often get away with only
+eight bits per parameter with very little loss of accuracy in the final result,
+so if your file size is too large you can try using `quantize_weights` to
+transform the parameters down.
+
+ bazel build tensorflow/tools/graph_transforms:transform_graph && \
+ bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
+ --in_graph=/tmp/tensorflow_inception_optimized.pb \
+ --out_graph=/tmp/tensorflow_inception_quantized.pb \
+ --inputs='Mul:0' --outputs='softmax:0' --transforms='quantize_weights'
+
+If you look at the resulting file size, you should see that it’s about a quarter
+of the original at 23MB.
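+
+As a rough sanity check, the `summarize_graph` numbers above line up with these
+sizes: 23,885,411 const parameters at four bytes each is roughly 95MB of 32-bit
+float weights, and roughly 24MB once each parameter is stored in a single byte,
+which is consistent with the quarter-of-the-original figure quoted here.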
+
+Another transform is `round_weights`, which doesn't make the file smaller, but it
+makes the file compressible to about the same size as when `quantize_weights` is
+used. This is particularly useful for mobile development, taking advantage of
+the fact that app bundles are compressed before they’re downloaded by consumers.
+
+The original file does not compress well with standard algorithms, because the
+bit patterns of even very similar numbers can be very different. The
+`round_weights` transform keeps the weight parameters stored as floats, but
+rounds them to a set number of step values. This means there are a lot more
+repeated byte patterns in the stored model, and so compression can often bring
+the size down dramatically, in many cases to near the size it would be if the
+weights were stored as eight-bit values.
+
+Another advantage of `round_weights` is that the framework doesn’t have to
+allocate a temporary buffer to unpack the parameters into, as we have to when
+we just use `quantize_weights`. This saves a little bit of latency (though the
+results should be cached so it’s only costly on the first run) and makes it
+possible to use memory mapping, as described later.
+
+## Binary Size
+
+One of the biggest differences between mobile and server development is the
+importance of binary size. On desktop machines it’s not unusual to have
+executables that are hundreds of megabytes on disk, but for mobile and embedded
+apps it’s vital to keep the binary as small as possible so that user downloads
+are easy. As mentioned above, TensorFlow only includes a subset of op
+implementations by default, but this still results in a 12 MB final
+executable. To reduce this, you can set up the library to only include the
+implementations of the ops that you actually need, based on automatically
+analyzing your model. To use it:
+
+- Run `tools/print_required_ops/print_selective_registration_header.py` on your
+ model to produce a header file that only enables the ops it uses.
+
+- Place the `ops_to_register.h` file somewhere that the compiler can find
+ it. This can be in the root of your TensorFlow source folder.
+
+- Build TensorFlow with `SELECTIVE_REGISTRATION` defined, for example by passing
+  in `--copts="-DSELECTIVE_REGISTRATION"` to your Bazel build command.
+
+This process recompiles the library so that only the needed ops and types are
+included, which can dramatically reduce the executable size. For example, with
+Inception v3, the new size is only 1.5MB.
+
+## How to Profile your Model
+
+Once you have an idea of what your device's peak performance range is, it’s
+worth looking at its actual current performance. Using a standalone TensorFlow
+benchmark, rather than running it inside a larger app, helps isolate just the
+TensorFlow contribution to the
+latency. The
+[tensorflow/tools/benchmark](https://www.tensorflow.org/code/tensorflow/tools/benchmark/) tool
+is designed to help you do this. To run it on Inception v3 on your desktop
+machine, build and run the benchmark tool:
+
+ bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \
+ bazel-bin/tensorflow/tools/benchmark/benchmark_model \
+ --graph=/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \
+ --input_layer_shape="1,299,299,3" --input_layer_type="float" \
+ --output_layer="softmax:0" --show_run_order=false --show_time=false \
+ --show_memory=false --show_summary=true --show_flops=true --logtostderr
+
+You should see output that looks something like this:
+
+    ============================== Top by Computation Time ==============================
+    [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [Name]
+    Conv2D 22.859 14.212 13.700 4.972% 4.972% 3871.488 conv_4/Conv2D
+    Conv2D 8.116 8.964 11.315 4.106% 9.078% 5531.904 conv_2/Conv2D
+    Conv2D 62.066 16.504 7.274 2.640% 11.717% 443.904 mixed_3/conv/Conv2D
+    Conv2D 2.530 6.226 4.939 1.792% 13.510% 2765.952 conv_1/Conv2D
+    Conv2D 55.585 4.605 4.665 1.693% 15.203% 313.600 mixed_2/tower/conv_1/Conv2D
+    Conv2D 127.114 5.469 4.630 1.680% 16.883% 81.920 mixed_10/conv/Conv2D
+    Conv2D 47.391 6.994 4.588 1.665% 18.548% 313.600 mixed_1/tower/conv_1/Conv2D
+    Conv2D 39.463 7.878 4.336 1.574% 20.122% 313.600 mixed/tower/conv_1/Conv2D
+    Conv2D 127.113 4.192 3.894 1.413% 21.535% 114.688 mixed_10/tower_1/conv/Conv2D
+    Conv2D 70.188 5.205 3.626 1.316% 22.850% 221.952 mixed_4/conv/Conv2D
+
+    ============================== Summary by node type ==============================
+    [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB]
+    Conv2D 94 244.899 88.952% 88.952% 35869.953
+    BiasAdd 95 9.664 3.510% 92.462% 35873.984
+    AvgPool 9 7.990 2.902% 95.364% 7493.504
+    Relu 94 5.727 2.080% 97.444% 35869.953
+    MaxPool 5 3.485 1.266% 98.710% 3358.848
+    Const 192 1.727 0.627% 99.337% 0.000
+    Concat 11 1.081 0.393% 99.730% 9892.096
+    MatMul 1 0.665 0.242% 99.971% 4.032
+    Softmax 1 0.040 0.015% 99.986% 4.032
+    <> 1 0.032 0.012% 99.997% 0.000
+    Reshape 1 0.007 0.003% 100.000% 0.000
+
+    Timings (microseconds): count=50 first=330849 curr=274803 min=232354 max=415352 avg=275563 std=44193
+    Memory (bytes): count=50 curr=128366400(all same)
+    514 nodes defined 504 nodes observed
+
+
+This is the summary view, which is enabled by the `show_summary` flag. To
+interpret it, the first table is a list of the nodes that took the most time, in
+order by how long they took. From left to right, the columns are:
+
+- Node type, what kind of operation this was.
+
+- Start time of the op, showing where it falls in the sequence of operations.
+
+- First time in milliseconds. This is how long the operation took on the first
+ run of the benchmark, since by default 20 runs are executed to get more
+ reliable statistics. The first time is useful to spot which ops are doing
+ expensive calculations on the first run, and then caching the results.
+
+- Average time for the operation across all runs, in milliseconds.
+
+- What percentage of the total time for one run the op took. This is useful to
+ understand where the hotspots are.
+
+- The cumulative total time of this and the previous ops in the table. This is
+ handy for understanding what the distribution of work is across the layers, to
+ see if just a few of the nodes are taking up most of the time.
+
+- The amount of memory consumed by outputs of this type of op.
+
+- Name of the node.
+
+The second table is similar, but instead of breaking down the timings by
+particular named nodes, it groups them by the kind of op. This is very useful to
+understand which op implementations you might want to optimize or eliminate from
+your graph. The table is arranged with the most costly operations at the start,
+and only shows the top ten entries, with a placeholder for other nodes. The
+columns from left to right are:
+
+- Type of the nodes being analyzed.
+
+- Accumulated average time taken by all nodes of this type, in milliseconds.
+
+- What percentage of the total time was taken by this type of operation.
+
+- Cumulative time taken by this and op types higher in the table, so you can
+ understand the distribution of the workload.
+
+- How much memory the outputs of this op type took up.
+
+Both of these tables are set up so that you can easily copy and paste their
+results into spreadsheet documents, since they are output with tabs as
+separators between the columns. The summary by node type can be the most useful
+when looking for optimization opportunities, since it’s a pointer to the code
+that’s taking the most time. In this case, you can see that the Conv2D ops are
+almost 90% of the execution time. This is a sign that the graph is pretty
+optimal, since convolutions and matrix multiplies are expected to be the bulk of
+a neural network’s computing workload.
+
+As a rule of thumb, it’s more worrying if you see a lot of other operations
+taking up more than a small fraction of the time. For neural networks, the ops
+that don’t involve large matrix multiplications should usually be dwarfed by the
+ones that do, so if you see a lot of time going into those it’s a sign that
+either your network is non-optimally constructed, or the code implementing those
+ops is not as optimized as it could
+be. [Performance bugs](https://github.com/tensorflow/tensorflow/issues) or
+patches are always welcome if you do encounter this situation, especially if
+they include an attached model exhibiting this behavior and the command line
+used to run the benchmark tool on it.
+
+The run above was on your desktop, but the tool also works on Android, which is
+where it’s most useful for mobile development. Here’s an example command line to
+run it on a 64-bit ARM device:
+
+ bazel build -c opt --config=android_arm64 \
+ tensorflow/tools/benchmark:benchmark_model
+ adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp
+ adb push /tmp/tensorflow_inception_graph.pb /data/local/tmp/
+ adb shell '/data/local/tmp/benchmark_model \
+ --graph=/data/local/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \
+ --input_layer_shape="1,299,299,3" --input_layer_type="float" \
+ --output_layer="softmax:0" --show_run_order=false --show_time=false \
+ --show_memory=false --show_summary=true'
+
+You can interpret the results in exactly the same way as the desktop version
+above. If you have any trouble figuring out what the right input and output
+names and types are, take a look at the
+[Preparing models](prepare_models.md) page for details about detecting these
+for your model, and look at the `summarize_graph` tool, which may give you
+helpful information.
+
+There isn’t good support for command line tools on iOS, so instead there’s a
+separate example
+at
+[tensorflow/examples/ios/benchmark](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark) that
+packages the same functionality inside a standalone app. This outputs the
+statistics to both the screen of the device and the debug log. If you want
+on-screen statistics for the Android example apps, you can turn them on by
+pressing the volume-up button.
+
+## Profiling within your own app
+
+The output you see from the benchmark tool is generated from modules that are
+included as part of the standard TensorFlow runtime, which means you have access
+to them within your own applications too. You can see an example of how to do
+that [here](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark/BenchmarkViewController.mm?l=139).
+
+The basic steps are:
+
+1. Create a StatSummarizer object:
+
+ tensorflow::StatSummarizer stat_summarizer(tensorflow_graph);
+
+2. Set up the options:
+
+ tensorflow::RunOptions run_options;
+ run_options.set_trace_level(tensorflow::RunOptions::FULL_TRACE);
+ tensorflow::RunMetadata run_metadata;
+
+3. Run the graph:
+
+ run_status = session->Run(run_options, inputs, output_layer_names, {},
+ output_layers, &run_metadata);
+
+4. Calculate the results and print them out:
+
+ assert(run_metadata.has_step_stats());
+ const tensorflow::StepStats& step_stats = run_metadata.step_stats();
+    stat_summarizer.ProcessStepStats(step_stats);
+    stat_summarizer.PrintStepStats();
+
+## Visualizing Models
+
+The most effective way to speed up your code is by altering your model so it
+does less work. To do that, you need to understand what your model is doing, and
+visualizing it is a good first step. To get a high-level overview of your graph,
+use [TensorBoard](https://github.com/tensorflow/tensorboard).
+
+## Threading
+
+The desktop version of TensorFlow has a sophisticated threading model, and will
+try to run multiple operations in parallel if it can. In our terminology this is
+called “inter-op parallelism” (though to avoid confusion with “intra-op”, you
+could think of it as “between-op” instead), and can be set by specifying
+`inter_op_parallelism_threads` in the session options.
+
+By default, mobile devices run operations serially; that is,
+`inter_op_parallelism_threads` is set to 1. Mobile processors usually have few
+cores and a small cache, so running multiple operations accessing disjoint parts
+of memory usually doesn’t help performance. “Intra-op parallelism” (or
+“within-op”) can be very helpful though, especially for computation-bound
+operations like convolutions where different threads can feed off the same small
+set of memory.
+
+On mobile, how many threads an op will use is set to the number of cores by
+default, or 2 when the number of cores can't be determined. You can override the
+default number of threads that ops are using by setting
+`intra_op_parallelism_threads` in the session options. It’s a good idea to
+reduce the default if your app has its own threads doing heavy processing, so
+that they don’t interfere with each other.
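+
+If you want to set these explicitly, here’s a minimal sketch of configuring
+both options before creating a session (the thread counts are just example
+values, not recommendations):
+
+    tensorflow::SessionOptions options;
+    // Run ops one at a time, which is the mobile default.
+    options.config.set_inter_op_parallelism_threads(1);
+    // Let each op use up to two threads internally.
+    options.config.set_intra_op_parallelism_threads(2);
+    std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));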
+
+To see more details on session options, look at [ConfigProto](https://www.tensorflow.org/code/tensorflow/core/protobuf/config.proto).
+
+## Retrain with mobile data
+
+The biggest cause of accuracy problems when running models on mobile apps is
+unrepresentative training data. For example, most of the Imagenet photos are
+well-framed so that the object is in the center of the picture, well-lit, and
+shot with a normal lens. Photos from mobile devices are often poorly framed,
+badly lit, and can have fisheye distortions, especially selfies.
+
+The solution is to expand your training set with data actually captured from
+your application. This step can involve extra work, since you’ll have to label
+the examples yourself, but even if you just use it to expand your original
+training data, it can help dramatically. Improving the training set by doing
+this, and by fixing other quality issues like duplicates or badly labeled
+examples, is the single best way to improve accuracy. It’s usually a
+bigger help than altering your model architecture or using different techniques.
+
+## Reducing model loading time and/or memory footprint
+
+Most operating systems allow you to load a file using memory mapping, rather
+than going through the usual I/O APIs. Instead of allocating an area of memory
+on the heap and then copying bytes from disk into it, you simply tell the
+operating system to make the entire contents of a file appear directly in
+memory. This has several advantages:
+
+* Speeds loading
+* Reduces paging (increases performance)
+* Does not count towards RAM budget for your app
+
+TensorFlow has support for memory mapping the weights that form the bulk of most
+model files. Because of limitations in the `ProtoBuf` serialization format, we
+have to make a few changes to our model loading and processing code. The
+way memory mapping works is that we have a single file where the first part is a
+normal `GraphDef` serialized into the protocol buffer wire format, but then the
+weights are appended in a form that can be directly mapped.
+
+To create this file, run the
+`tensorflow/contrib/util:convert_graphdef_memmapped_format` tool. This takes in
+a `GraphDef` file that’s been run through `freeze_graph` and converts it to the
+format that has the weights appended at the end. Since that file’s no longer a
+standard `GraphDef` protobuf, you then need to make some changes to the loading
+code. You can see an example of this in
+the
+[iOS Camera demo app](https://www.tensorflow.org/code/tensorflow/examples/ios/camera/tensorflow_utils.mm?l=147),
+in the `LoadMemoryMappedModel()` function.
+
+The same code (with the Objective-C calls for getting the filenames substituted)
+can be used on other platforms too. Because we’re using memory mapping, we need
+to start by creating a special TensorFlow environment object that’s set up with
+the file we’ll be using:
+
+    std::unique_ptr<tensorflow::MemmappedEnv> memmapped_env;
+    memmapped_env.reset(
+        new tensorflow::MemmappedEnv(tensorflow::Env::Default()));
+    tensorflow::Status mmap_status =
+        memmapped_env->InitializeFromFile(file_path);
+
+You then need to pass in this environment to subsequent calls, like this one for
+loading the graph:
+
+ tensorflow::GraphDef tensorflow_graph;
+ tensorflow::Status load_graph_status = ReadBinaryProto(
+        memmapped_env.get(),
+ tensorflow::MemmappedFileSystem::kMemmappedPackageDefaultGraphDef,
+ &tensorflow_graph);
+
+You also need to create the session with a pointer to the environment you’ve
+created:
+
+ tensorflow::SessionOptions options;
+ options.config.mutable_graph_options()
+ ->mutable_optimizer_options()
+ ->set_opt_level(::tensorflow::OptimizerOptions::L0);
+    options.env = memmapped_env.get();
+
+ tensorflow::Session* session_pointer = nullptr;
+ tensorflow::Status session_status =
+ tensorflow::NewSession(options, &session_pointer);
+
+One thing to notice here is that we’re also disabling automatic optimizations,
+since in some cases these will fold constant sub-trees, and so create copies of
+tensor values that we don’t want and use up more RAM.
+
+Once you’ve gone through these steps, you can use the session and graph as
+normal, and you should see a reduction in loading time and memory usage.
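+
+For example, the remaining hookup is just to hand the memory-mapped graph to
+the session you created and check the result; a minimal sketch:
+
+    std::unique_ptr<tensorflow::Session> session(session_pointer);
+    tensorflow::Status create_status = session->Create(tensorflow_graph);
+    if (!create_status.ok()) {
+      LOG(ERROR) << "Could not create TensorFlow graph: " << create_status;
+    }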
+
+## Protecting model files from easy copying
+
+By default, your models will be stored in the standard serialized protobuf
+format on disk. In theory this means that anybody can copy your model, which you
+may not want. However, in practice, most models are so application-specific and
+obfuscated by optimizations that the risk is similar to that of competitors
+disassembling and reusing your code. If you do want to make it tougher for
+casual users to access your files, it is possible to take some basic steps.
+
+Most of our examples use
+the
+[ReadBinaryProto()](https://www.tensorflow.org/code/tensorflow/core/platform/env.cc?q=core/platform/env.cc&l=409) convenience
+call to load a `GraphDef` from disk. This does require an unencrypted protobuf on
+disk. Luckily though, the implementation of the call is pretty straightforward
+and it should be easy to write an equivalent that can decrypt in memory. Here's
+some code that shows how you can read and decrypt a protobuf using your own
+decryption routine:
+
+ Status ReadEncryptedProto(Env* env, const string& fname,
+ ::tensorflow::protobuf::MessageLite* proto) {
+ string data;
+ TF_RETURN_IF_ERROR(ReadFileToString(env, fname, &data));
+
+ DecryptData(&data); // Your own function here.
+
+      if (!proto->ParseFromString(data)) {
+ return errors::DataLoss("Can't parse ", fname, " as binary proto");
+ }
+ return Status::OK();
+ }
+
+To use this you’d need to define the `DecryptData()` function yourself. It could
+be as simple as something like:
+
+ void DecryptData(string* data) {
+      for (int i = 0; i < data->size(); ++i) {
+        (*data)[i] = (*data)[i] ^ 0x23;
+ }
+ }
+
+You may want something more complex, but exactly what you’ll need is outside the
+current scope here.
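+
+Whichever scheme you use, the new function is then a drop-in replacement for
+`ReadBinaryProto()` when you load your graph. Here’s a sketch of a call site
+(the file path is just a placeholder):
+
+    tensorflow::GraphDef tensorflow_graph;
+    // Placeholder path; point this at wherever your encrypted model lives.
+    tensorflow::Status load_status = ReadEncryptedProto(
+        tensorflow::Env::Default(), "/data/local/tmp/encrypted_graph.pb",
+        &tensorflow_graph);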
diff --git a/community/en/docs/tfmobile/prepare_models.md b/community/en/docs/tfmobile/prepare_models.md
new file mode 100644
index 00000000000..cd82a148b53
--- /dev/null
+++ b/community/en/docs/tfmobile/prepare_models.md
@@ -0,0 +1,319 @@
+# Preparing models for mobile deployment
+
+Warning: TensorFlow Mobile is __deprecated__.
+
+
+
+ TensorFlow Lite is our main
+ mobile and embedded offering. We are
+ working hard to close the feature gap between TensorFlow Mobile and
+ TensorFlow Lite. We expect to deprecate TensorFlow Mobile in early 2019. We
+ will give ample notice to our users when we get to that point and will
+ provide help and support to ensure easy migrations.
+
+
+ In the meantime, please use TensorFlow Lite. If you have a feature request,
+ such as a missing op, please post to our GitHub.
+
+
+
+The requirements for storing model information during training are very
+different from when you want to release it as part of a mobile app. This section
+covers the tools involved in converting from a training model to something
+releasable in production.
+
+## What is up with all the different saved file formats?
+
+You may find yourself getting very confused by all the different ways that
+TensorFlow can save out graphs. To help, here’s a rundown of some of the
+different components, and what they are used for. The objects are mostly defined
+and serialized as protocol buffers:
+
+- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto):
+ Defines a single operation in a model. It has a unique name, a list of the
+ names of other nodes it pulls inputs from, the operation type it implements
+ (for example `Add`, or `Mul`), and any attributes that are needed to control
+ that operation. This is the basic unit of computation for TensorFlow, and all
+ work is done by iterating through a network of these nodes, applying each one
+ in turn. One particular operation type that’s worth knowing about is `Const`,
+ since this holds information about a constant. This may be a single, scalar
+ number or string, but it can also hold an entire multi-dimensional tensor
+ array. The values for a `Const` are stored inside the `NodeDef`, and so large
+ constants can take up a lot of room when serialized.
+
+- [Checkpoint](https://www.tensorflow.org/code/tensorflow/core/util/tensor_bundle/tensor_bundle.h). Another
+ way of storing values for a model is by using `Variable` ops. Unlike `Const`
+ ops, these don’t store their content as part of the `NodeDef`, so they take up
+ very little space within the `GraphDef` file. Instead their values are held in
+ RAM while a computation is running, and then saved out to disk as checkpoint
+ files periodically. This typically happens as a neural network is being
+ trained and weights are updated, so it’s a time-critical operation, and it may
+ happen in a distributed fashion across many workers, so the file format has to
+ be both fast and flexible. They are stored as multiple checkpoint files,
+ together with metadata files that describe what’s contained within the
+ checkpoints. When you’re referring to a checkpoint in the API (for example
+ when passing a filename in as a command line argument), you’ll use the common
+ prefix for a set of related files. If you had these files:
+
+ /tmp/model/model-chkpt-1000.data-00000-of-00002
+ /tmp/model/model-chkpt-1000.data-00001-of-00002
+ /tmp/model/model-chkpt-1000.index
+ /tmp/model/model-chkpt-1000.meta
+
+  You would refer to them as `/tmp/model/model-chkpt-1000`.
+
+- [GraphDef](https://www.tensorflow.org/code/tensorflow/core/framework/graph.proto):
+ Has a list of `NodeDefs`, which together define the computational graph to
+ execute. During training, some of these nodes will be `Variables`, and so if
+ you want to have a complete graph you can run, including the weights, you’ll
+ need to call a restore operation to pull those values from
+ checkpoints. Because checkpoint loading has to be flexible to deal with all of
+ the training requirements, this can be tricky to implement on mobile and
+ embedded devices, especially those with no proper file system available like
+ iOS. This is where
+ the
+ [`freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py) script
+ comes in handy. As mentioned above, `Const` ops store their values as part of
+ the `NodeDef`, so if all the `Variable` weights are converted to `Const` nodes,
+ then we only need a single `GraphDef` file to hold the model architecture and
+ the weights. Freezing the graph handles the process of loading the
+ checkpoints, and then converts all Variables to Consts. You can then load the
+ resulting file in a single call, without having to restore variable values
+ from checkpoints. One thing to watch out for with `GraphDef` files is that
+ sometimes they’re stored in text format for easy inspection. These versions
+ usually have a ‘.pbtxt’ filename suffix, whereas the binary files end with
+ ‘.pb’.
+
+- [FunctionDefLibrary](https://www.tensorflow.org/code/tensorflow/core/framework/function.proto):
+ This appears in `GraphDef`, and is effectively a set of sub-graphs, each with
+ information about their input and output nodes. Each sub-graph can then be
+ used as an op in the main graph, allowing easy instantiation of different
+ nodes, in a similar way to how functions encapsulate code in other languages.
+
+- [MetaGraphDef](https://www.tensorflow.org/code/tensorflow/core/protobuf/meta_graph.proto):
+ A plain `GraphDef` only has information about the network of computations, but
+ doesn’t have any extra information about the model or how it can be
+ used. `MetaGraphDef` contains a `GraphDef` defining the computation part of
+ the model, but also includes information like ‘signatures’, which are
+ suggestions about which inputs and outputs you may want to call the model
+ with, data on how and where any checkpoint files are saved, and convenience
+ tags for grouping ops together for ease of use.
+
+- [SavedModel](https://www.tensorflow.org/code/tensorflow/core/protobuf/saved_model.proto):
+ It’s common to want to have different versions of a graph that rely on a
+ common set of variable checkpoints. For example, you might need a GPU and a
+ CPU version of the same graph, but keep the same weights for both. You might
+ also need some extra files (like label names) as part of your
+ model. The
+ [SavedModel](https://www.tensorflow.org/code/tensorflow/python/saved_model/README.md) format
+ addresses these needs by letting you save multiple versions of the same graph
+ without duplicating variables, and also storing asset files in the same
+ bundle. Under the hood, it uses `MetaGraphDef` and checkpoint files, along
+ with extra metadata files. It’s the format that you’ll want to use if you’re
+ deploying a web API using TensorFlow Serving, for example.
+
+## How do you get a model you can use on mobile?
+
+In most situations, training a model with TensorFlow will give you a folder
+containing a `GraphDef` file (usually ending with the `.pb` or `.pbtxt` extension) and
+a set of checkpoint files. What you need for mobile or embedded deployment is a
+single `GraphDef` file that’s been ‘frozen’, or had its variables converted into
+inline constants so everything’s in one file. To handle the conversion, you’ll
+need the `freeze_graph.py` script, which is held in
+[`tensorflow/python/tools/freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py). You’ll run it like this:
+
+ bazel build tensorflow/python/tools:freeze_graph
+ bazel-bin/tensorflow/python/tools/freeze_graph \
+ --input_graph=/tmp/model/my_graph.pb \
+ --input_checkpoint=/tmp/model/model.ckpt-1000 \
+ --output_graph=/tmp/frozen_graph.pb \
+    --output_node_names=output_node
+
+The `input_graph` argument should point to the `GraphDef` file that holds your
+model architecture. It’s possible that your `GraphDef` has been stored in a text
+format on disk, in which case it’s likely to end in `.pbtxt` instead of `.pb`,
+and you should add an extra `--input_binary=false` flag to the command.
+
+The `input_checkpoint` should be the most recent saved checkpoint. As mentioned
+in the checkpoint section, you need to give the common prefix to the set of
+checkpoints here, rather than a full filename.
+
+`output_graph` defines where the resulting frozen `GraphDef` will be
+saved. Because it’s likely to contain a lot of weight values that take up a
+large amount of space in text format, it’s always saved as a binary protobuf.
+
+`output_node_names` is a list of the names of the nodes that you want to extract
+the results of your graph from. This is needed because the freezing process
+needs to understand which parts of the graph are actually needed, and which are
+artifacts of the training process, like summarization ops. Only ops that
+contribute to calculating the given output nodes will be kept. If you know how
+your graph is going to be used, these should just be the names of the nodes you
+pass into `Session::Run()` as your fetch targets. The easiest way to find the
+node names is to inspect the Node objects while building your graph in Python.
+Inspecting your graph in TensorBoard is another simple way. You can get some
+suggestions on likely outputs by running the [`summarize_graph` tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/README.md#inspecting-graphs).
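+
+If you’d rather inspect the graph from C++, here’s a small sketch that loads a
+`GraphDef` and prints every node name and op type, which can also help when
+hunting for likely inputs and outputs (the path is a placeholder):
+
+    tensorflow::GraphDef graph_def;
+    // Placeholder path; point this at your own GraphDef file.
+    TF_CHECK_OK(tensorflow::ReadBinaryProto(
+        tensorflow::Env::Default(), "/tmp/model/my_graph.pb", &graph_def));
+    for (const tensorflow::NodeDef& node : graph_def.node()) {
+      std::cout << node.name() << " (" << node.op() << ")" << std::endl;
+    }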
+
+Because the output format for TensorFlow has changed over time, there are a
+variety of other less commonly used flags available too, like `input_saver`, but
+hopefully you shouldn’t need these on graphs trained with modern versions of the
+framework.
+
+## Using the Graph Transform Tool
+
+A lot of the things you need to do to efficiently run a model on device are
+available through the [Graph Transform
+Tool](https://www.tensorflow.org/code/tensorflow/tools/graph_transforms/README.md). This
+command-line tool takes an input `GraphDef` file, applies the set of rewriting
+rules you request, and then writes out the result as a `GraphDef`. See the
+documentation for more information on how to build and run this tool.
+
+### Removing training-only nodes
+
+TensorFlow `GraphDefs` produced by the training code contain all of the
+computation that’s needed for back-propagation and updates of weights, as well
+as the queuing and decoding of inputs, and the saving out of checkpoints. All of
+these nodes are no longer needed during inference, and some of the operations
+like checkpoint saving aren’t even supported on mobile platforms. To create a
+model file that you can load on devices you need to delete those unneeded
+operations by running the `strip_unused_nodes` rule in the Graph Transform Tool.
+
+The trickiest part of this process is figuring out the names of the nodes you
+want to use as inputs and outputs during inference. You'll need these anyway
+once you start to run inference, but you also need them here so that the
+transform can calculate which nodes are not needed on the inference-only
+path. These may not be obvious from the training code. The easiest way to
+determine the node name is to explore the graph with TensorBoard.
+
+Remember that mobile applications typically gather their data from sensors and
+have it as arrays in memory, whereas training typically involves loading and
+decoding representations of the data stored on disk. In the case of Inception v3
+for example, there’s a `DecodeJpeg` op at the start of the graph that’s designed
+to take JPEG-encoded data from a file retrieved from disk and turn it into an
+arbitrary-sized image. After that there’s a `ResizeBilinear` op to scale it to
+the expected size, followed by a couple of other ops that convert the byte data
+into float and scale the value magnitudes in the way the rest of the graph
+expects. A typical mobile app will skip most of these steps because it’s getting
+its input directly from a live camera, so the input node you will actually
+supply will be the output of the `Mul` node in this case.
+
+You’ll need to do a similar process of inspection to figure out the correct
+output nodes.
+
+If you’ve just been given a frozen `GraphDef` file, and are not sure about the
+contents, try using the `summarize_graph` tool to print out information
+about the inputs and outputs it finds from the graph structure. Here’s an
+example with the original Inception v3 file:
+
+    bazel run tensorflow/tools/graph_transforms:summarize_graph -- \
+ --in_graph=tensorflow_inception_graph.pb
+
+Once you have an idea of what the input and output nodes are, you can feed them
+into the graph transform tool as the `--input_names` and `--output_names`
+arguments, and call the `strip_unused_nodes` transform, like this:
+
+    bazel run tensorflow/tools/graph_transforms:transform_graph -- \
+    --in_graph=tensorflow_inception_graph.pb \
+    --out_graph=optimized_inception_graph.pb --inputs='Mul' --outputs='softmax' \
+ --transforms='
+ strip_unused_nodes(type=float, shape="1,299,299,3")
+ fold_constants(ignore_errors=true)
+ fold_batch_norms
+ fold_old_batch_norms'
+
+One thing to look out for here is that you need to specify the size and type
+that you want your inputs to be. This is because any values that you’re going to
+be passing in as inputs to inference need to be fed to special `Placeholder` op
+nodes, and the transform may need to create them if they don’t already exist. In
+the case of Inception v3 for example, a `Placeholder` node replaces the old
+`Mul` node that used to output the resized and rescaled image array, since we’re
+going to be doing that processing ourselves before we call TensorFlow. It keeps
+the original name though, which is why we always feed in inputs to `Mul` when we
+run a session with our modified Inception graph.
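+
+To make that concrete, here’s a minimal sketch of feeding the transformed graph
+at inference time, assuming `session` was created from it: we build a float
+tensor with the shape passed to `strip_unused_nodes`, feed it to the node named
+`Mul`, and fetch `softmax` (the image preprocessing itself is left out):
+
+    // Shape matches the one given to strip_unused_nodes above.
+    tensorflow::Tensor input_tensor(
+        tensorflow::DT_FLOAT, tensorflow::TensorShape({1, 299, 299, 3}));
+    // ... fill input_tensor.flat<float>() with your resized, rescaled image ...
+    std::vector<tensorflow::Tensor> outputs;
+    tensorflow::Status run_status =
+        session->Run({{"Mul", input_tensor}}, {"softmax"}, {}, &outputs);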
+
+After you’ve run this process, you’ll have a graph that only contains the actual
+nodes you need to run your prediction process. This is the point where it
+becomes useful to run metrics on the graph, so it’s worth running
+`summarize_graph` again to understand what’s in your model.
+
+## What ops should you include on mobile?
+
+There are hundreds of operations available in TensorFlow, and each one has
+multiple implementations for different data types. On mobile platforms, the size
+of the executable binary that’s produced after compilation is important, because
+app download bundles need to be as small as possible for the best user
+experience. If all of the ops and data types are compiled into the TensorFlow
+library then the total size of the compiled library can be tens of megabytes, so
+by default only a subset of ops and data types are included.
+
+That means that if you load a model file that’s been trained on a desktop
+machine, you may see the error “No OpKernel was registered to support Op” when
+you load it on mobile. The first thing to try is to make sure you’ve stripped
+out any training-only nodes, since the error will occur at load time even if the
+op is never executed. If you’re still hitting the same problem once that’s done,
+you’ll need to look at adding the op to your built library.
+
+The criteria for including ops and types fall into several categories:
+
+- Are they only useful in back-propagation, for gradients? Since mobile is
+ focused on inference, we don’t include these.
+
+- Are they useful mainly for other training needs, such as checkpoint saving?
+ These we leave out.
+
+- Do they rely on frameworks that aren’t always available on mobile, such as
+ libjpeg? To avoid extra dependencies we don’t include ops like `DecodeJpeg`.
+
+- Are there types that aren’t commonly used? We don’t include boolean variants
+ of ops for example, since we don’t see much use of them in typical inference
+ graphs.
+
+These ops are trimmed by default to optimize for inference on mobile, but it is
+possible to alter some build files to change the default. After altering the
+build files, you will need to recompile TensorFlow. See below for more details
+on how to do this, and also see optimizing binary size for more on reducing
+your binary size.
+
+### Locate the implementation
+
+Operations are broken into two parts. The first is the op definition, which
+declares the signature of the operation: which inputs, outputs, and attributes
+it has. These take up very little space, and so all are included by default. The
+implementations of the op computations are done in kernels, which live in the
+`tensorflow/core/kernels` folder. You need to compile the C++ file containing
+the kernel implementation of the op you need into the library. To figure out
+which file that is, you can search for the operation name in the source
+files.
+
+[Here’s an example search in github](https://github.com/search?utf8=%E2%9C%93&q=repo%3Atensorflow%2Ftensorflow+extension%3Acc+path%3Atensorflow%2Fcore%2Fkernels+REGISTER+Mul&type=Code&ref=searchresults).
+
+You’ll see that this search is looking for the `Mul` op implementation, and it
+finds it in `tensorflow/core/kernels/cwise_op_mul_1.cc`. You need to look for
+macros beginning with `REGISTER`, with the op name you care about as one of the
+string arguments.
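+
+As a rough guide to what to look for, a kernel registration has the op name as
+a string argument inside a `REGISTER` macro. The following is schematic only,
+not the literal contents of that file, and `MyMulOp` is a made-up class name:
+
+    // Schematic registration of a hypothetical MyMulOp kernel for the "Mul"
+    // op on CPU, specialized to float.
+    REGISTER_KERNEL_BUILDER(
+        Name("Mul").Device(DEVICE_CPU).TypeConstraint<float>("T"),
+        MyMulOp);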
+
+In this case, the implementations are actually broken up across multiple `.cc`
+files, so you’d need to include all of them in your build. If you’re more
+comfortable using the command line for code search, here’s a grep command that
+also locates the right files if you run it from the root of your TensorFlow
+repository:
+
+`grep 'REGISTER.*"Mul"' tensorflow/core/kernels/*.cc`
+
+### Add the implementation to the build
+
+If you’re using Bazel, and building for Android, you’ll want to add the files
+you’ve found to
+the
+[`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or
+[`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You
+may also need to include any .cc files they depend on in there. If the build
+complains about missing header files, add the .h’s that are needed into
+the
+[`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target.
+
+If you’re using a makefile targeting iOS, Raspberry Pi, etc., go to
+[`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt) and
+add the right implementation files there.