diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 2 additions & 0 deletions
diff --git a/docs/index.rst b/docs/index.rst
Lines changed: 37 additions & 10 deletions
diff --git a/docs/plugins/dffml_model.rst b/docs/plugins/dffml_model.rst
Lines changed: 175 additions & 1 deletion
diff --git a/docs/plugins/dffml_operation_implementation.rst b/docs/plugins/dffml_operation_implementation.rst
Lines changed: 22 additions & 2 deletions
diff --git a/docs/plugins/dffml_service_cli.rst b/docs/plugins/dffml_service_cli.rst
Lines changed: 28 additions & 0 deletions
diff --git a/docs/plugins/dffml_source.rst b/docs/plugins/dffml_source.rst
Lines changed: 5 additions & 0 deletions
@@ -22,6 +22,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Skeleton for service creation was added
 - Simple Linear Regression model from scratch
 - Community link in CONTRIBUTING.md.
+- Explained three main parts of DFFML on docs homepage
+- Documentation on how to use ML models on docs Models plugin page.
 ### Changed
 - feature/codesec became it's own branch, binsec
 - BaseOrchestratorContext `run_operations` strict is default to true. With
 
@@ -10,18 +10,32 @@ Data Flow Facilitator for Machine Learning (DFFML) provides APIs for dataset
 generation and storage, and model definition using any machine learning
 framework, from high level down to low level use is supported.
 
-The goal of DFFML is to build a community driven library of plugins for dataset
-generation and model definition. So that we as developers and researchers can
-quickly and easily plug and play various pieces of data with various model
-implementations.
+DFFML abstracts three main parts of the machine learning workflow to reduce
+the amount of code that gets rewritten when applying machine learning to a new
+problem.
 
-The more we build up the library of plugins (which anyone can maintain, they
-don't have to be contributed upstream unless you want to) the more variations on
-model implementations, feature data generators, and database backend
-abstractions, we all have to work with.
+It's an object-oriented approach involving three main classes.
+
+- ``Source`` classes handle the storage of datasets, saving and loading them
+  from files, databases, remote APIs, etc.
+
+
+- ``Model`` classes handle implementations of machine learning algorithms,
+  usually by wrapping a machine learning framework. DFFML is not a machine
+  learning library like PyTorch or TensorFlow; it sits at a higher level.
+  Because of this, you most likely won't have to write any code to start doing
+  machine learning. If you want to fine-tune a model or create your own
+  implementation, all you need to do is subclass ``Model``.
+
+
+  - To get started with machine learning right away, head over to
+    :ref:`plugin_models`.
+
+
+- ``OperationImplementation`` classes are akin to microservices, wrapped in
+  a data flow architecture. More information on these can be found in the Data
+  Flow usage example. *The data flow portion of the API is less mature*.
 
-Right now we've released a wrapper around the Tensorflow DNN estimator, and a
-set of feature generators which gather data from git repositories.
 
 .. toctree::
    :glob:
@@ -34,6 +48,19 @@ set of feature generators which gather data from git repositories.
    plugins/index
    api/index
 
+The goal of DFFML is to build a community-driven library of plugins for dataset
+generation and model definition, so that we as developers and researchers can
+quickly and easily plug and play various pieces of data with various model
+implementations.
+
+The more we build up the library of plugins (which anyone can maintain; they
+don't have to be contributed upstream unless you want to), the more variations
+on model implementations, feature data generators, and database backend
+abstractions we all have to work with.
+
+Right now we've released a wrapper around the TensorFlow DNN classifier, a
+simple linear regression estimator, and a set of operations which gather data
+from git repositories.
 
 Indices and tables
 ==================
 
@@ -1,3 +1,5 @@
+.. _plugin_models:
+
 Models
 ======
 
@@ -7,6 +9,11 @@ abstract the usage of machine learning models.
 dffml_model_tensorflow
 ----------------------
 
+.. code-block:: console
+
+    pip install dffml-model-tensorflow
+
+
 tfdnnc
 ~~~~~~
 
@@ -15,6 +22,93 @@ tfdnnc
 Implemented using Tensorflow's DNNClassifier. Models are saved under the
 ``directory`` in subdirectories named after the hash of their feature names.
 
+.. code-block:: console
+
+    $ wget http://download.tensorflow.org/data/iris_training.csv
+    $ wget http://download.tensorflow.org/data/iris_test.csv
+    $ head iris_training.csv
+    $ sed -i 's/.*setosa,versicolor,virginica/SepalLength,SepalWidth,PetalLength,PetalWidth,classification/g' *.csv
+    $ head iris_training.csv
+    $ dffml train \
+      -model tfdnnc \
+      -model-epochs 3000 \
+      -model-steps 20000 \
+      -model-classification classification \
+      -model-classifications 0 1 2 \
+      -model-clstype int \
+      -sources iris=csv \
+      -source-filename iris_training.csv \
+      -features \
+        def:SepalLength:float:1 \
+        def:SepalWidth:float:1 \
+        def:PetalLength:float:1 \
+        def:PetalWidth:float:1 \
+      -log debug
+    ... lots of output ...
+    $ dffml accuracy \
+      -model tfdnnc \
+      -model-classification classification \
+      -model-classifications 0 1 2 \
+      -model-clstype int \
+      -sources iris=csv \
+      -source-filename iris_test.csv \
+      -features \
+        def:SepalLength:float:1 \
+        def:SepalWidth:float:1 \
+        def:PetalLength:float:1 \
+        def:PetalWidth:float:1 \
+      -log critical
+    0.99996233782
+    $ dffml predict all \
+      -model tfdnnc \
+      -model-classification classification \
+      -model-classifications 0 1 2 \
+      -model-clstype int \
+      -sources iris=csv \
+      -source-filename iris_test.csv \
+      -features \
+        def:SepalLength:float:1 \
+        def:SepalWidth:float:1 \
+        def:PetalLength:float:1 \
+        def:PetalWidth:float:1 \
+      -caching \
+      -log critical \
+      > results.json
+    $ head -n 33 results.json
+    [
+        {
+            "extra": {},
+            "features": {
+                "PetalLength": 4.2,
+                "PetalWidth": 1.5,
+                "SepalLength": 5.9,
+                "SepalWidth": 3.0,
+                "classification": 1
+            },
+            "last_updated": "2019-07-31T02:00:12Z",
+            "prediction": {
+                "confidence": 0.9999997615814209,
+                "value": 1
+            },
+            "src_url": "0"
+        },
+        {
+            "extra": {},
+            "features": {
+                "PetalLength": 5.4,
+                "PetalWidth": 2.1,
+                "SepalLength": 6.9,
+                "SepalWidth": 3.1,
+                "classification": 2
+            },
+            "last_updated": "2019-07-31T02:00:12Z",
+            "prediction": {
+                "confidence": 0.9999984502792358,
+                "value": 2
+            },
+            "src_url": "1"
+        },
+
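The ``results.json`` excerpt above pairs each record's true label (``features.classification``) with the model's output (``prediction.value``). As a minimal sketch of how that file could be spot-checked outside the CLI, the snippet below scores the two records shown in the excerpt. The helper name ``fraction_correct`` is hypothetical and is not part of DFFML, and the records are trimmed to the fields the check uses.

```python
import json

# Two records copied from the results.json excerpt in this diff,
# trimmed to the fields this check reads.
RESULTS_JSON = """
[
    {"features": {"classification": 1},
     "prediction": {"confidence": 0.9999997615814209, "value": 1},
     "src_url": "0"},
    {"features": {"classification": 2},
     "prediction": {"confidence": 0.9999984502792358, "value": 2},
     "src_url": "1"}
]
"""


def fraction_correct(records, label="classification"):
    """Return the share of records whose predicted value equals the label.

    Hypothetical helper for illustration; not a DFFML function.
    """
    if not records:
        raise ValueError("no records to score")
    hits = sum(
        1 for record in records
        if record["prediction"]["value"] == record["features"][label]
    )
    return hits / len(records)


records = json.loads(RESULTS_JSON)
print(fraction_correct(records))
```

To score a full run, replace ``RESULTS_JSON`` with the contents of the complete ``results.json`` file. The ``head -n 33`` excerpt on its own is truncated and is not valid JSON.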
 **Args**
 
 - directory: String
@@ -48,4 +142,84 @@ Implemented using Tensorflow's DNNClassifier. Models are saved under the
 - clstype: locate
 
   - default: <class 'str'>
-  - Data type of classifications values (default: str)
+  - Data type of classifications values (default: str)
+
+dffml_model_scratch
+-------------------
+
+.. code-block:: console
+
+    pip install dffml-model-scratch
+
+
+scratchslr
+~~~~~~~~~~
+
+*Core*
+
+Simple Linear Regression Model for 2 variables implemented from scratch.
+Models are saved under the ``directory`` in subdirectories named after the
+hash of their feature names.
+
+.. code-block:: console
+
+    $ cat > dataset.csv << EOF
+    Years,Salary
+    1,40
+    2,50
+    3,60
+    4,70
+    5,80
+    EOF
+    $ dffml train \
+        -model scratchslr \
+        -features def:Years:int:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename dataset.csv \
+        -source-readonly \
+        -log debug
+    $ dffml accuracy \
+        -model scratchslr \
+        -features def:Years:int:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename dataset.csv \
+        -source-readonly \
+        -log debug
+    1.0
+    $ echo -e 'Years,Salary\n6,0\n' | \
+      dffml predict all \
+        -model scratchslr \
+        -features def:Years:int:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename /dev/stdin \
+        -source-readonly \
+        -log debug
+    [
+        {
+            "extra": {},
+            "features": {
+                "Salary": 0,
+                "Years": 6
+            },
+            "last_updated": "2019-07-19T09:46:45Z",
+            "prediction": {
+                "confidence": 1.0,
+                "value": 90.0
+            },
+            "src_url": "0"
+        }
+    ]
+
+**Args**
+
+- directory: String
+
+  - default: /home/user/.cache/dffml/scratch
+  - Directory where state should be saved
+
+- predict: String
+
+  - Label or the value to be predicted
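For readers who want to see the arithmetic behind the ``scratchslr`` example, here is a minimal sketch of ordinary least squares for one feature and one target, run on the same ``Years``/``Salary`` data. It is an independent illustration and not the plugin's source. The function names are hypothetical, and the assumption that the reported accuracy is the coefficient of determination (R squared) is not confirmed by this diff.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x/y cross term.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot


# Same rows as dataset.csv in the console example.
years = [1, 2, 3, 4, 5]
salary = [40, 50, 60, 70, 80]

slope, intercept = fit_line(years, salary)
print(slope, intercept)                           # 10.0 30.0
print(r_squared(years, salary, slope, intercept)) # 1.0
print(slope * 6 + intercept)                      # 90.0
```

The data lie exactly on the line ``Salary = 10 * Years + 30``, which is why the fit is perfect and the prediction for six years is 90.0, matching the CLI output above.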
@@ -8,6 +8,11 @@ which could do anything, make HTTP requests, do inference, etc.
 dffml
 -----
 
+.. code-block:: console
+
+    pip install dffml
+
+
 associate
 ~~~~~~~~~
 
@@ -72,6 +77,11 @@ No description
 dffml_feature_git
 -----------------
 
+.. code-block:: console
+
+    pip install dffml-feature-git
+
+
 check_if_valid_git_repository_URL
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -353,8 +363,13 @@ No description
 
 - work: work_spread(type: int)
 
-dffml_feature_codesec
----------------------
+dffml_operations_binsec
+-----------------------
+
+.. code-block:: console
+
+    pip install dffml-operations-binsec
+
 
 cleanup_rpm
 ~~~~~~~~~~~
@@ -470,6 +485,11 @@ No description
 dffml_feature_auth
 ------------------
 
+.. code-block:: console
+
+    pip install dffml-feature-auth
+
+
 scrypt
 ~~~~~~
 
 
@@ -14,9 +14,37 @@ Services. It also helps developers hack on DFFML itself.
 
     $ dffml service -h
 
+You can create a new Python package and start implementing a new plugin for
+DFFML right away with the ``create`` command of ``dev``.
+
+.. code-block:: console
+
+    $ dffml service dev create model cool-ml-model
+    $ cd cool-ml-model
+    $ python setup.py test
+
+When you're done you can upload it to PyPI and it'll be ``pip`` installable so
+that other DFFML users can use it in their code or via the CLI. If you don't
+want to deal with uploading to PyPI, you can install it from your git repo
+(wherever you host it).
+
+.. code-block:: console
+
+    $ python -m pip install -U git+https://github.com/user/cool-ml-model
+
+Make sure to look in ``setup.py`` and edit the ``entry_points`` to match
+whatever you've edited. This way whatever you make will be usable by others
+within the DFFML CLI (eventually HTTP API and others) as soon as they ``pip``
+install your package, nothing else required.
+
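As a rough sketch of what the ``entry_points`` edit might look like, the snippet below builds the mapping that ``setup.py`` would pass to ``setuptools.setup(entry_points=...)``. Everything named here is an assumption for illustration: the ``dffml.model`` group name is inferred from the plugin page names and is not confirmed by this diff, and ``coolmodel``, ``cool_ml_model.model``, and ``CoolModel`` are hypothetical. Check the generated ``setup.py`` for the real values.

```python
# Hypothetical names throughout: the "dffml.model" group, the "coolmodel"
# plugin name, and the module path are assumptions, not taken from DFFML.
ENTRY_POINTS = {
    "dffml.model": [
        # Format: "<name used on the CLI> = <module path>:<class name>"
        "coolmodel = cool_ml_model.model:CoolModel",
    ],
}


def split_entry_point(spec):
    """Split a 'name = module:attr' entry point string into its parts."""
    name, target = (part.strip() for part in spec.split("=", 1))
    module, _, attr = target.partition(":")
    return name, module, attr


# In setup.py this dict would be passed as setup(..., entry_points=ENTRY_POINTS).
for spec in ENTRY_POINTS["dffml.model"]:
    print(split_entry_point(spec))
```

If you rename the module or class after running ``create``, the right-hand side of the entry point has to change with it, otherwise the plugin will not be found after installation.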
 dffml
 -----
 
+.. code-block:: console
+
+    pip install dffml
+
+
 dev
 ~~~
 
 
@@ -7,6 +7,11 @@ abstract the loading and storage of data / datasets.
 dffml
 -----
 
+.. code-block:: console
+
+    pip install dffml
+
+
 csv
 ~~~