This sample application demonstrates the IBM Watson Machine Learning Bluemix offering. It's an extension of the Big Data University Predicting Financial Performance of a Company course. While participation in this course is recommended, it's not required.
This application is based on the Node.js and Express framework. It uses the Watson Machine Learning service API to integrate with IBM SPSS Modeler analytics.
The application delivers an analytics-driven environment where one can explore time series from various perspectives and use the most suitable forecasting methods to forecast the future. With this sample application, you can:
- Download financial and economic time series from open data sources and explore them to observe general characteristics such as trend, seasonality, return distributions, and correlation between time series.
- Perform near-future forecasting based on historical data, with confidence levels, so that time series analysis and forecasting can be applied to a specific business problem.
For details, see this section.
- IBM ID to log in to Bluemix. See free trial if you don't yet have an ID.
- Cloud Foundry command line interface (only if you want to manually deploy to Bluemix)
- IBM SPSS Modeler (only if you want to modify the provided stream or create new ones)
- Node.js 6.3.1 or higher runtime (only if you want to modify the source code)
The general, high-level steps are described below. Refer to IBM Watson Machine Learning Service for Bluemix - General for complete details.
- From the Bluemix catalog, choose the Watson Machine Learning and dashDB services, which will later be bound to a Node.js application created from this sample. Note that the service itself offers a set of samples (this one among them) that can be deployed and bound automatically, which is the simplest way to see the sample in action.
- Upload an SPSS Modeler stream file to your instance of the Watson Machine Learning service. This sample application comes with an SPSS Modeler stream (stream/financial-performance-prediction.str) that can be used for this purpose. The stream can also be created from scratch (see Preparing an SPSS Modeler stream).
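The upload in the previous step can also be done programmatically through the file API. The following is a minimal sketch rather than part of the sample's code; the /pm/v1/file/{contextId} path and the accessKey query parameter are assumptions, so check the Watson Machine Learning service documentation for the exact endpoint:

```javascript
// Minimal sketch: upload the Modeler stream shipped with this repository
// through the service's file API. The REST path below is an assumption.
const fs = require('fs');
const https = require('https');

const serviceHost = process.env.PM_SERVICE_HOST; // host part of the "url" in the pm-20 credentials
const accessKey = process.env.PM_ACCESS_KEY;     // "access_key" from the pm-20 credentials

const req = https.request({
  method: 'PUT',
  host: serviceHost,
  path: '/pm/v1/file/financial-performance-prediction?accessKey=' + encodeURIComponent(accessKey)
}, (res) => {
  console.log('Upload finished with HTTP status', res.statusCode);
});
req.on('error', (err) => console.error('Upload failed:', err.message));

// Stream the .str file as the request body; pipe() ends the request when done.
fs.createReadStream('stream/financial-performance-prediction.str').pipe(req);
```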
For a fast start, you can deploy the prebuilt app to Bluemix by clicking the following button:
Note that the application is fully functional only if bound to instances of the Watson Machine Learning and dashDB services, which must be done manually. See the instructions.
As an alternative to the button, you can manually deploy the application to Bluemix by pushing it with Cloud Foundry commands, as described in the next section. Manual deployment is required when you want to deploy modified source code. Manual deployment consists of pushing the application to Bluemix followed by binding the Watson Machine Learning service to the deployed application.
To push an application to Bluemix, open a shell, change to the directory of your application, and run the following:
cf api <region>
where <region> is https://api.ng.bluemix.net or https://api.eu-gb.bluemix.net, depending on the Bluemix region you want to work with (US or Europe, respectively).
cf login
which is interactive; provide all required data.
cf push <app-name>
where <app-name> is the application name of your choice.
cf push can also read a manifest file (see the Cloud Foundry documentation). If you decide to use a manifest, you can hardcode the names of your Watson Machine Learning and dashDB service instances instead of binding them manually. See the services section of the manifest.yml.template file.
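If you go the manifest route, a minimal manifest.yml could look roughly like the sketch below; the application and service instance names are placeholders to replace with your own, and manifest.yml.template remains the authoritative starting point:

```yaml
applications:
- name: financial-performance-prediction   # <app-name> of your choice
  memory: 512M                             # adjust to your plan
  services:
  - my-watson-ml-instance                  # name of your Watson Machine Learning instance
  - my-dashdb-instance                     # name of your dashDB instance
```

Declaring the instances under services makes cf push bind them during deployment, so no manual binding is needed afterwards.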
If this is your first Bluemix Node.js application, see the node-helloworld project documentation to gain general experience.
See the instructions.
Running the application locally is useful when you want to test your changes before deploying them to Bluemix. For information about working with source code, see Source code changes.
When the changes are ready, open a shell, change to the directory of your cloned repository, and run npm start to start the application. The running application is available in a browser at http://localhost:4000. In development mode, the hot reloading feature is available on port 4010.
Applications that run locally can also use the Watson Machine Learning and dashDB Bluemix services. See the instructions.
The repository comes with a prebuilt application. If you want to rebuild the app after modifying the sources:
- Follow the steps listed in the Requirements section
- Change to the directory containing the downloaded source code or the cloned git repo
- Run npm install
- Run ./node_modules/.bin/webpack
To add the power of IBM SPSS Modeler analytics to any application, use the Watson Machine Learning service API.
This particular application involves:
- Retrieving all models currently deployed to the Watson Machine Learning service (model API)
- Uploading a stream file to use in jobs (file API)
- Submitting TRAINING and BATCH_SCORE jobs against an uploaded Modeler stream file (service batch job API)
- Checking the status of a job (service batch job API)
- Deleting jobs (service batch job API)
- Deleting files (file API)
The code placed in the pa folder provides an example of how to use this API.
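As an illustration of the pattern rather than a copy of the pa/ code, the sketch below lists the deployed models using the url and access_key from the pm-20 credentials; the /model path is an assumption, so consult the service documentation for the actual endpoints behind the model, file, and batch job APIs:

```javascript
// Minimal sketch: retrieve the models currently deployed to the
// Watson Machine Learning service. The "/model" path is assumed.
const https = require('https');

const serviceUrl = process.env.PM_SERVICE_URL; // "url" from the pm-20 credentials
const accessKey = process.env.PM_ACCESS_KEY;   // "access_key" from the pm-20 credentials

https.get(serviceUrl + '/model?accessKey=' + encodeURIComponent(accessKey), (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => {
    // Each entry describes one deployed model.
    console.log('Deployed models:', JSON.parse(body));
  });
}).on('error', (err) => console.error('Request failed:', err.message));
```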
As stated in the Requirements section, you must order instances of the Watson Machine Learning and dashDB services from the Bluemix catalog if you don't yet have them. The next step is to connect your deployed application to those services, which is called binding. There are a few ways to achieve this in the Bluemix environment; this document describes binding either through the Bluemix user interface or with the cf CLI.
The application running locally can use Bluemix services if the credentials for the Watson Machine Learning and dashDB services are appropriately pasted into the ./config/local.json file. Complete the following steps:
- Deploy the application to Bluemix and bind it to the Watson Machine Learning service.
- Go to the application overview pane, choose the bound Watson Machine Learning service, and click Show Credentials. Copy the pm-20 credentials JSON (url, access_key).
- Create the ./config/local.json file by copying the ./config/local.json.template file. Edit local.json and paste in the pm-20 credentials you obtained in the previous step (see the sketch after this list).
- Perform similar steps for the dashDB service and its credentials.
- Start your local application. You should now be able to interact with the Watson Machine Learning service (for example, by listing the uploaded streams).
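For orientation only, the resulting ./config/local.json might look roughly like the sketch below. The key names here are assumptions; the actual structure must match ./config/local.json.template:

```json
{
  "pm-20": {
    "url": "<url from the pm-20 credentials>",
    "access_key": "<access_key from the pm-20 credentials>"
  },
  "dashDB": {
    "hostname": "<hostname from the dashDB credentials>",
    "username": "<username from the dashDB credentials>",
    "password": "<password from the dashDB credentials>"
  }
}
```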
- Go to finance.yahoo.com.
- Type any company name or symbol of interest. In this example, we'll type IBM in the search box and select IBM from the search results.
- Go to the Historical Data tab.
- Select a time range, click Apply, and click Download to save the dataset to your local machine.
- The saved file is a .csv file in the format described in the import guidelines below.
- Open your application and click Import Custom Data.
- Browse to the file you downloaded from Yahoo Finance. Add a Company name and Symbol and click Import & Save Data.
The Symbol should be short and descriptive because the application will use it for chart legends.
As an alternative to downloading Yahoo Finance data, you can also import other data files. See the following guidelines.
- Your data must be in comma-separated values format (.csv), meaning that values are separated by “,” in the file, as shown in the sample below.
- The data file must have at least two columns named Date and Adj Close. These columns are extracted during import and used as the final dataset.
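For illustration, a file downloaded from Yahoo Finance typically starts with a header row like the one below; the numbers here are made up, and only the Date and Adj Close columns are kept on import:

```
Date,Open,High,Low,Close,Adj Close,Volume
2017-01-03,167.00,168.50,166.25,167.90,160.12,2934500
2017-01-04,168.10,169.00,167.40,168.75,160.93,3101200
```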
- Create a new stream.
- From the Source palette, add a Var. File node.
- Browse to the table.csv file.
- On the Annotations tab, rename the Var. File node to "in" and click OK.
- From the Field Ops palette, add a Type node.
- Set the VALUE field’s role to Both.
- Click Read Values and click OK.
- From the Modeling palette, add a Time Series node.
- Double-click the Time Series node, go to the Data Specifications tab, and set the Date/time field to DATE and set the Time interval to Months.
- Go to the Build Options tab and adjust the settings as follows:
- Go to the Model Options tab and select Extend records into the future.
- Click Run to run the stream.
- From the Output palette, add a Table node to the stream.
- Right-click the Table node and select Use as Scoring Branch.
- The finished stream is shown below.
The code is available under the Apache License, Version 2.0.