Merge pull request #123 from radibnia77/master
update build.gradle
xun-hu-at-futurewei-com committed Feb 24, 2021
2 parents 7871f56 + da93ca9 commit f1d046e3bee0c6269f18164651fc239fc5cca297
Showing 118 changed files with 4,028 additions and 4,029 deletions.
@@ -1,5 +1,4 @@

IMService/gradlew
IMService/gradlew.bat
IMService/gradle/wrapper/gradle-wrapper.jar
IMService/gradle/wrapper/gradle-wrapper.properties
IMService/gradle/wrapper/gradle-wrapper.properties
@@ -3,4 +3,4 @@
**/out/**
**/.idea/**
IMService/gradle/**
IMService/gradle*
IMService/gradle*
@@ -27,8 +27,8 @@ subprojects {
compile.exclude group: 'ch.qos.logback'
}
repositories {
maven { url "https://maven.google.com" }
maven { url "https://plugins.gradle.org/m2/" }
maven { url 'https://repo1.maven.org/maven2/' }
maven { url 'https://maven.google.com' }
}

dependencies {

@@ -1,32 +1,32 @@
### What is predictor_dl_model?
predictor_dl_model is a suite of offline processes to forecast traffic inventory. The suite contains the following modules; more information is included in each module’s directory.

1. datagen: This module generates the factdata table, which contains traffic data.
2. trainer: This module builds and trains a deep learning model based on the factdata table.
3. pipeline: This module processes the factdata table into training-ready data used to train the neural network.

### Prerequisites
Cluster: Spark 2.3/HDFS 2.7/YARN 2.3/MapReduce 2.7/Hive 1.2
Driver: Python 3.6, Spark Client 2.3, HDFS Client, tensorflow-gpu 1.10

To install dependencies run:
pip install -r requirements.txt


### Install and Run
1. Download the blue-martin/models project
2. Transfer the predictor_dl_model directory to ~/code/predictor_dl_model/ on a GPU machine that also has a Spark client
3. cd predictor_dl_model
4. pip install -r requirements.txt to install the required packages. These packages are installed on top of Python using pip.
5. python setup.py install (to install the predictor_dl_model package)
6. (optional) python setup.py bdist_egg (to create a .egg file to provide to spark-submit)
7. Follow the steps in ~/code/predictor_dl_model/datagen/README.md to generate data
8. Go to the directory ~/code/predictor_dl_model/predictor_dl_model
9. Run run.sh or each script individually


### Documentation
Documentation is provided through comments in config.yml and README files

### Note
To inspect the exported model's signatures, run:
saved_model_cli show --dir <model_dir>/<version> --all

@@ -1,3 +1,3 @@
{
    "python.pythonPath": "/home/reza/anaconda3/envs/py27/bin/python"
}
@@ -1,21 +1,21 @@
### Pipeline Steps
The pipeline takes the following steps:

1. Reads factdata from the Hive table from day(-365) (configurable) to day(-1); the input is day(-1). day(0) is today and day(-1) is yesterday (see the date-window sketch after this list).
2. Processes the data using Spark and writes the results into tfrecords, e.g. factdata.tfrecords.<date> (configurable)
3. Starts the trainer to read the tfrecords and create the model
4. Writes the model into a local directory
5. Compares the new model with the old model (new-model evaluation) (future)
6. Sets the predictor to use the new model; the predictor reads the name of the model it uses from Elasticsearch (future)
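
The day(-N) offsets in step 1 map to calendar dates relative to today. A minimal sketch of that mapping, assuming plain Python dates (the function name and defaults are illustrative, not taken from the pipeline code):

```python
from datetime import date, timedelta

def date_window(start_offset=-365, end_offset=-1, today=None):
    """Return (start, end) dates for the factdata read window.

    day(0) is today and day(-1) is yesterday, so the defaults span
    day(-365) through day(-1), matching step 1 above.
    """
    today = today or date.today()
    return (today + timedelta(days=start_offset),
            today + timedelta(days=end_offset))

# Example: with today = 2021-02-24, the window is 2020-02-25 to 2021-02-23.
```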

### UCKEY Elements
A uckey consists of the following items (a parsing sketch follows the list).

ucdoc.m = parts[0] #media-type
ucdoc.si = parts[1] #slot-id
ucdoc.t = parts[2] #connection-type
ucdoc.g = parts[3] #gender
ucdoc.a = parts[4] #age
ucdoc.pm = parts[5] #price-model
ucdoc.r = parts[6] #resident-location
ucdoc.ipl = parts[7] #ip-location
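
A uckey is these eight elements joined into a single string, split into `parts` as shown above. A minimal parsing sketch, assuming a comma delimiter (the actual delimiter is defined in the pipeline configuration and is an assumption here):

```python
# Field names follow the ucdoc attributes listed above.
UCKEY_FIELDS = ['m', 'si', 't', 'g', 'a', 'pm', 'r', 'ipl']

def parse_uckey(uckey, delimiter=','):
    # The comma delimiter is an assumption; the pipeline may use
    # a different separator.
    parts = uckey.split(delimiter)
    return dict(zip(UCKEY_FIELDS, parts))

# Example (illustrative values):
# parse_uckey('native,s123,4g,g_m,2,CPM,r1,ip1')
# -> {'m': 'native', 'si': 's123', 't': '4g', 'g': 'g_m',
#     'a': '2', 'pm': 'CPM', 'r': 'r1', 'ipl': 'ip1'}
```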
@@ -1,18 +1,18 @@
# Copyright 2019, Futurewei Technologies
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
@@ -1,108 +1,108 @@
# Copyright 2019, Futurewei Technologies
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import json
import requests
from elasticsearch import Elasticsearch
import yaml
import argparse

class ESClient:
    """Minimal Elasticsearch client bound to one index and document type."""

    def __init__(self, host, port, es_index, es_type):
        self.es_index = es_index
        self.es_type = es_type
        self.es = Elasticsearch([{'host': host, 'port': port}])

    def __put(self, uckey, dict=dict):
        dict_res = self.
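
The class is truncated at this point in the diff, but the constructor shown above is enough for a minimal usage sketch (the host, port, index, and type values are illustrative, not taken from the project's configuration):

```python
# Illustrative connection values; the real ones come from config.yml.
client = ESClient(host='10.0.0.1', port=9200,
                  es_index='predictions', es_type='doc')
```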