From 1a0bd33427f7c91a25c5f713aff6d2b350a3aedf Mon Sep 17 00:00:00 2001
From: Viacheslav Katsuba
Date: Fri, 27 Jul 2018 00:23:36 +0300
Subject: [PATCH] Add tabulation in example config

---
 .../Getting Started with FastScore/index.md | 90 +++++++++----------
 1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/Getting Started/Getting Started with FastScore/index.md b/Getting Started/Getting Started with FastScore/index.md
index 6af9e7b..c9711ea 100644
--- a/Getting Started/Getting Started with FastScore/index.md
+++ b/Getting Started/Getting Started with FastScore/index.md
@@ -30,10 +30,10 @@ If you need support or have questions, please email us: support@opendatagroup.co

 ## Installing FastScore

-This guide will walk you through installing and running Open Data Group's FastScore microservices suite. The following instructions will generally assume that you're working on a Linux machine. There are slight differences if you're running FastScore on MacOS, which will be indicated by a special note. The differences if you're running Windows have not yet been fully charted.
+This guide will walk you through installing and running Open Data Group's FastScore microservices suite. The following instructions will generally assume that you're working on a Linux machine. There are slight differences if you're running FastScore on MacOS, which will be indicated by a special note. The differences if you're running Windows have not yet been fully charted.

 ### Prerequisites
-The FastScore Microservices Suite is hosted on DockerHub ([https://hub.docker.com/u/fastscore/](https://hub.docker.com/u/fastscore/)). As such, [one must first install Docker](https://docs.docker.com/engine/installation/). For example, on Ubuntu Linux:
+The FastScore Microservices Suite is hosted on DockerHub ([https://hub.docker.com/u/fastscore/](https://hub.docker.com/u/fastscore/)). As such, [one must first install Docker](https://docs.docker.com/engine/installation/).
For example, on Ubuntu Linux: ``` bash $ sudo apt-get install docker.io @@ -63,7 +63,7 @@ It's also useful (recommended but not mandatory) to have Docker Compose installe Once Docker has been installed, there are only a few steps needed to get FastScore running. 1. Start the FastScore services, either manually, or via Docker Compose (recommended). -2. Write a FastScore configuration file. +2. Write a FastScore configuration file. 3. Configure FastScore using the FastScore CLI and configuration file. 4. Connect to the FastScore Dashboard with your browser. @@ -71,7 +71,7 @@ Let's go through each step carefully. ### Start FastScore Microservices Suite with Docker Compose (Recommended) -Docker-Compose is a utility that streamlines the configuration and simultaneous execution of multiple Docker containers. A Docker Compose file is a YAML file defining [services](https://docs.docker.com/compose/compose-file/#service-configuration-reference), [networks](https://docs.docker.com/compose/compose-file/#network-configuration-reference) and [volumes](https://docs.docker.com/compose/compose-file/#volume-configuration-reference). The default path for a Compose file is `./docker-compose.yml`, but custom filenames are supported with the `-f ` flag. The interested reader is directed to [Docker's documentation for more information on Compose files](https://docs.docker.com/compose/compose-file/). +Docker-Compose is a utility that streamlines the configuration and simultaneous execution of multiple Docker containers. A Docker Compose file is a YAML file defining [services](https://docs.docker.com/compose/compose-file/#service-configuration-reference), [networks](https://docs.docker.com/compose/compose-file/#network-configuration-reference) and [volumes](https://docs.docker.com/compose/compose-file/#volume-configuration-reference). The default path for a Compose file is `./docker-compose.yml`, but custom filenames are supported with the `-f ` flag. 
The interested reader is directed to [Docker's documentation for more information on Compose files](https://docs.docker.com/compose/compose-file/). #### Starting and Stopping the FastScore Database @@ -87,12 +87,12 @@ and stop it with docker volume rm db ``` -(You may name the database whatever you wish---we use the convention `db` in this document). Note that the database should generally be running before starting any FastScore services which use the database. +(You may name the database whatever you wish---we use the convention `db` in this document). Note that the database should generally be running before starting any FastScore services which use the database. Add the database to Model Manage in the usual way for Docker volumes: 1. If using docker-compose, put the volume in the `docker-compose.yml` file(see example file below) -2. If running manually, with the `-v` flag when using `docker run`, e.g., +2. If running manually, with the `-v` flag when using `docker run`, e.g., ```docker run -it -d --net=host --rm -v db:/var/lib/mysql fastscore/model-manage-mysql:1.7 ``` (see more below) #### Example Docker Compose File @@ -150,12 +150,12 @@ services: volumes: db: - external: true + external: true ``` - -Docker Compose files can be used to initialize or halt all of the FastScore microservices components with a single command. -For example, to start microservices using docker-compose: +Docker Compose files can be used to initialize or halt all of the FastScore microservices components with a single command. + +For example, to start microservices using docker-compose: ``` bash docker-compose up -d @@ -167,7 +167,7 @@ Similarly, all of the specified microservices can be stopped with the command docker-compose down -v ``` -(Here, the `-v` flag instructs `docker-compose` to remove any lingering volumes created by the compose file. This does *not* include external Docker volumes, such as those created by `docker volume create`). 
+(Here, the `-v` flag instructs `docker-compose` to remove any lingering volumes created by the compose file. This does *not* include external Docker volumes, such as those created by `docker volume create`). Check that all the Docker containers are running with the ```docker ps``` command. The output should look something like this: @@ -207,7 +207,7 @@ docker run -it -d --net=host --rm -v db:/var/lib/mysql fastscore/model-manage-my It is additionally useful to install the FastScore Command-Line Interface (CLI). -### Installing the FastScore Command-Line Interface (CLI) +### Installing the FastScore Command-Line Interface (CLI) The FastScore CLI can be downloaded and installed using the following commands: ``` bash @@ -217,7 +217,7 @@ cd fastscore-cli-1.7 sudo python setup.py install ``` -This will install the required dependencies. The FastScore CLI is a Python tool, so it doesn't need to be compiled, and the setup script should automatically add the CLI to `$PATH`. +This will install the required dependencies. The FastScore CLI is a Python tool, so it doesn't need to be compiled, and the setup script should automatically add the CLI to `$PATH`. > `python-setuptools` and `python-dev` (i.e. header files) are required to properly install the FastScore CLI. These may or may not be already present on your system. If not, you will need to install them. @@ -258,7 +258,7 @@ Available commands: Run 'fastscore help ' to get more details on usage ``` -This displays a list of all of the FastScore CLI commands. +This displays a list of all of the FastScore CLI commands. ## Configuring and Starting FastScore @@ -293,28 +293,28 @@ fastscore: topic: notify ``` -Configuration files are written in YAML. The configuration file above specifies the host machines and ports for the Model Manage container, the MySQL database container used by Model Manage, and two Engine containers, all hosted on the same machine. 
Additionally, Pneumo, an asynchronous notification library used by FastScore, is configured to communicate via Kafka. +Configuration files are written in YAML. The configuration file above specifies the host machines and ports for the Model Manage container, the MySQL database container used by Model Manage, and two Engine containers, all hosted on the same machine. Additionally, Pneumo, an asynchronous notification library used by FastScore, is configured to communicate via Kafka. > The example file above is designed for use on a Linux machine; you will need to modify it for use in other settings (e.g. MacOS). And, as discussed above, you must create the db volume with ```docker volume create db``` (or remove the volume link from `database`) ### Connecting to and Configuring FastScore with the FastScore CLI -Once the FastScore suite of services is running, we have to configure Connect using the file we created earlier. +Once the FastScore suite of services is running, we have to configure Connect using the file we created earlier. -Connect the FastScore CLI to the dashboard using the following command: +Connect the FastScore CLI to the dashboard using the following command: ``` bash $ fastscore connect https://localhost:8000 ``` -Then, use the `config set` command to set the configuration file for Connect: +Then, use the `config set` command to set the configuration file for Connect: ``` bash $ fastscore config set config.yaml ``` -`config.yml` is the configuration file described [earlier in this document](#section-fastscore-configuration-files). +`config.yml` is the configuration file described [earlier in this document](#section-fastscore-configuration-files). -We can then check the status of our containers using the `fleet` command: +We can then check the status of our containers using the `fleet` command: ``` bash $ fastscore fleet @@ -328,11 +328,11 @@ model-manage-1 model-manage ok Now we're ready to start scoring. 
## Using the FastScore Dashboard -FastScore's Dashboard provides a convenient user interface for reviewing engine status and managing models, schemas, sensors, and streams. However, as compared to the FastScore CLI, it requires a few additional setup steps to get things running. +FastScore's Dashboard provides a convenient user interface for reviewing engine status and managing models, schemas, sensors, and streams. However, as compared to the FastScore CLI, it requires a few additional setup steps to get things running. First, if you are not running FastScore on your local machine (for example, if you have FastScore running on a cloud service platform), you will need to allow incoming and outgoing traffic on port 8000 (used by the FastScore Dashboard). You will also need to have configured FastScore as described in the previous section. -To access the Dashboard, take your browser to the FastScore host machine at port 8000. If all goes well , you will be greeted by this screen: +To access the Dashboard, take your browser to the FastScore host machine at port 8000. If all goes well , you will be greeted by this screen: ![Home Screen](images/HomeScreen.png) *On the left-hand side of the Dashboard are four sections: engine-1, engine-2, model-manage-1, Connect. These correspond to the Engine microservices, the Model Manage microservice, and the Connect microservice. The green dots on the engines and model manage indicate that they are currently running correctly. If you have configured additional engine containers, they will also appear on the side.* @@ -351,9 +351,9 @@ FastScore is a streaming analytic engine: its core functionality is to read in r ### Creating and Loading Assets into FastScore Model Manage Version 1.7 of FastScore supports models in Python, R, Java, MATLAB, [PFA](http://dmg.org/pfa/), [PrettyPFA](https://github.com/opendatagroup/hadrian/wiki/PrettyPFA-Reference) and C formats. 
Some setup steps differ slightly between Python/R models and PFA, Java, MATLAB, or C models. As a model interchange format, PFA can provide some benefits in performance, scalability, and security relative to R and Python. PrettyPFA is a human-readable equivalent to PFA. However, as the majority of users will be more familiar with R and Python, we focus on these two languages in this section. - -#### Loading Assets -The FastScore CLI allows a user to load models directly from the command line. The list of models currently loaded in FastScore can be viewed using the model list command: + +#### Loading Assets +The FastScore CLI allows a user to load models directly from the command line. The list of models currently loaded in FastScore can be viewed using the model list command: ``` bash $ fastscore model list @@ -362,7 +362,7 @@ Name Type MyModel Python ``` -Models can be added with `model add `, and removed with `model remove `. Additionally, the `fastscore model show ` command will display the named model. +Models can be added with `model add `, and removed with `model remove `. Additionally, the `fastscore model show ` command will display the named model. #### Models via the Dashboard The Dashboard provides functionality to add and manage models. To upload a model, under the Models tab, select the "Upload model" button, and choose a model from your local machine. Alternatively, "select model", depicted below, allows you to select an existing model from the model manager by name. @@ -381,7 +381,7 @@ All models are added to FastScore and executed using the same CLI commands, name fastscore model add ``` -Note that, in order to determine whether a model is Python or R, Engine requires that it have an appropriate file extension (`.py` for Python, `.R` for R, `.pfa` for PFA, and `.ppfa` for PrettyPFA). Also, in order to score a Python/R model, there are certain constraints on the form the model must take. 
+Note that, in order to determine whether a model is Python or R, Engine requires that it have an appropriate file extension (`.py` for Python, `.R` for R, `.pfa` for PFA, and `.ppfa` for PrettyPFA). Also, in order to score a Python/R model, there are certain constraints on the form the model must take. FastScore includes both a Python2 and Python3 model runner. By default, `.py` files are interpreted as Python2 models---to load a Python3 model, use the file extension `.py3`, or the flag `-type:python3` option with `fastscore model add`: @@ -416,13 +416,13 @@ def begin(): # perform any initialization needed here global myObject myObject = pickle.load(open('object.pkl')) pass # or do something with the unpickled object - + def action(datum): # datum is expected to be of the form '{"x":5, "y":6}' record = datum x = record['x'] y = record['y'] - yield x + y - + yield x + y + def end(): pass ``` @@ -453,7 +453,7 @@ FastScore enforces strong typing on both the inputs and outputs of its models us # fastscore.output: double ``` -Python and R models *must* specify schemas for their inputs and outputs. PrettyPFA and PFA models already contain the input and output schema as part of the model definition, so they do not require a schema attachment. +Python and R models *must* specify schemas for their inputs and outputs. PrettyPFA and PFA models already contain the input and output schema as part of the model definition, so they do not require a schema attachment. For example, a model that expects to receive records of two doubles as inputs might have the following schema: @@ -487,7 +487,7 @@ fastscore schema add output output.avsc Attachments can also be managed from within the Dashboard, using the Model Manage view. ### Input and Output Streams -Before a model can be run, it has to have some data to run on. Input and output streams are used to supply the incoming data to the model, and to return the corresponding scores. 
Currently, ten types of stream transports are supported: file, Kafka, Authenticated Kafka, Executable, HTTP, TCP, UDP, ODBC, debug, and console streams. All of these types are configured using a Stream Descriptor file. +Before a model can be run, it has to have some data to run on. Input and output streams are used to supply the incoming data to the model, and to return the corresponding scores. Currently, ten types of stream transports are supported: file, Kafka, Authenticated Kafka, Executable, HTTP, TCP, UDP, ODBC, debug, and console streams. All of these types are configured using a Stream Descriptor file. Stream Descriptors are small JSON files containing information about the stream. An example of a Stream Descriptor for a Kafka stream is displayed below: @@ -521,10 +521,10 @@ Stream descriptors are documented in more detail [on the stream descriptor page] } ``` -This file stream expects each line of the `neural_net_input.jsons` file to be a vector of doubles, encoded as a JSON object, and delimitated by newlines. The file is located in the `/root/data/` directory of the engine container. The `"Loop": false` line tells FastScore to stop reading the file after reaching the end of the file, as opposed to looping over the lines in the file. +This file stream expects each line of the `neural_net_input.jsons` file to be a vector of doubles, encoded as a JSON object, and delimitated by newlines. The file is located in the `/root/data/` directory of the engine container. The `"Loop": false` line tells FastScore to stop reading the file after reaching the end of the file, as opposed to looping over the lines in the file. #### Streams via FastScore CLI -The FastScore CLI can be used to configure data streams. The `stream list` command displays a list of existing streams: +The FastScore CLI can be used to configure data streams. 
The `stream list` command displays a list of existing streams: ``` bash $ fastscore stream list @@ -536,18 +536,18 @@ By default, two demo file streams are included in FastScore. The demo-1 data set ``` json { - "type":"array", - "items": { - "type": "record", + "type":"array", + "items": { + "type": "record", "fields": [ - {"name":"x", "type":"double"}, - {"name":"y", "type":"string"}] + {"name":"x", "type":"double"}, + {"name":"y", "type":"string"}] } } ``` -These demo streams can be used to test whether or not a simple model is working correctly. -Additional streams can be added using the `fastscore stream add ` command. Existing streams can be sampled (displaying the most recent items of the stream) with `fastscore stream sample `. +These demo streams can be used to test whether or not a simple model is working correctly. +Additional streams can be added using the `fastscore stream add ` command. Existing streams can be sampled (displaying the most recent items of the stream) with `fastscore stream sample `. For filestreams, it is easiest to manage container input and output by linking a directory on the host machine to the engine container. This can be done in the Docker-Compose file by modifying the engine service to the following: @@ -561,8 +561,8 @@ For filestreams, it is easiest to manage container input and output by linking a CONNECT_PREFIX: https://127.0.0.1:8001 volumes: # new volume section - ./data:/root/data - - + + [...] ``` @@ -581,7 +581,7 @@ This will link the `./data` directory on the host machine to the `/root/data` di } ``` -A similar stream descriptor can be used for the output stream to write the output scores to a file in the same directory. +A similar stream descriptor can be used for the output stream to write the output scores to a file in the same directory. > When using Docker volume linking to link a directory on the host machine to the Engine instance, Docker must have privileges to read and write from the specified directory. 
Additionally, the directory on the container must be chosen carefully, as its contents will be overwritten with the contents of the corresponding host directory upon linking. `/root/data` is safe (as it only contains the demo datafiles), but other directories on the container (e.g., `/usr`) may not be. @@ -599,7 +599,7 @@ When using the Dashboard, models will begin scoring as soon as both the model an To run a model using the FastScore CLI, use the `fastscore job` sequence of commands: * `fastscore job run ` runs the model named `` with the specified input and output streams. - * `fastscore job stop` halts the currently running model. + * `fastscore job stop` halts the currently running model. * `fastscore job status` and `fastscore job statistics` display various information about the currently running job. Some of the statistics displayed by the `fastscore job statistics` command, such as memory usage, are also shown on the Dashboard.
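
The `fastscore job` commands listed above can be collected into a small helper. The following is a minimal sketch and not part of the FastScore CLI. The function name `job_commands`, the stream names `my-input` and `my-output`, and the argument order for `job run` (model name, then input stream, then output stream) are assumptions made for illustration; only the subcommand names come from the text above. The sketch builds the command lines and prints them without executing anything.

```python
import shlex


def job_commands(model, stream_in, stream_out):
    """Return the `fastscore job` invocations, in order, as argv lists.

    Assumption: `job run` takes the model name followed by the input
    and output stream names. Check `fastscore help job` to confirm.
    """
    return [
        ["fastscore", "job", "run", model, stream_in, stream_out],
        ["fastscore", "job", "status"],
        ["fastscore", "job", "statistics"],
        ["fastscore", "job", "stop"],
    ]


# Hypothetical stream names, for illustration only.
for argv in job_commands("MyModel", "my-input", "my-output"):
    # shlex.join quotes each argument so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

To run the commands rather than print them, each list could be passed to `subprocess.run`, provided the FastScore CLI is installed and already connected as described earlier.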