# Lesson 9. Integration. Final project

Lesson outline:

- A recap of the "typical" process of building a machine learning model
- The "training" and "prediction" stages
- An introduction to JSON and REST

#### 1. The "typical" model-building process

<img src='https://drive.google.com/uc?export=view&id=1cHdANVRHhmqdPwkeWRHHMnqmwmyckeY4'>

It is also important to understand the difference between the training stage and the prediction stage.

<img src='https://drive.google.com/uc?export=view&id=1DUPlPmLwhbxW7j9ZEVuQdColxeUd1NnP' width=700>

Put simply, training a model is a small (if important) part of solving a concrete business problem.

A trained model is useless on its own if there is no way to interact with it.

#### Example

An online store "calls" us for recommendations for a user.

That is, we receive a user_id and are asked to return the top k recommended items.

In general, "rolling out" a model to production can look like this:

<img src='https://drive.google.com/uc?export=view&id=1J0ZW0O_5sl-WOE4FWfQT_CNTBmQWFUK7'>

The Model Deployment stage can be one of the most complex and non-trivial parts of the whole process (compared with model training and data preparation).

#### Example project implementation

<img src='https://drive.google.com/uc?export=view&id=1ymysdP0wVthPV6q8lZ6LKbaRh-6dHkjO' width=600>

https://hackmd.io/@AndreyPhys/flask-ml-api#/

### REST API

https://ru.wikipedia.org/wiki/REST

<img src='https://drive.google.com/uc?export=view&id=1waf6XV_4n6chq2gFnqUHErs87MfKqqQT' width=700>

### JSON

<img src='https://drive.google.com/uc?export=view&id=1okkgmx5vU1BQlWyW5k0djYkkF_czhn0k' width=700>
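As a rough illustration, the recommendation example from earlier in the lesson could be encoded in JSON along these lines. The field names `user_id`, `k` and `recommendations`, as well as the item ids, are assumptions made for this sketch and not a format defined in the lesson.

```python
import json

# Hypothetical request: the store asks for the top-k items for one user.
request_payload = {"user_id": 42, "k": 3}
request_body = json.dumps(request_payload)    # dict -> JSON text sent over HTTP

# Hypothetical response body that a recommendation service might send back.
response_body = '{"user_id": 42, "recommendations": [101, 205, 333]}'
response_payload = json.loads(response_body)  # JSON text -> dict

# Keep at most k items, in the order the service returned them.
top_items = response_payload["recommendations"][: request_payload["k"]]
print(request_body)
print(top_items)
```

Both sides only need to agree on the key names and value types; everything else about the model stays hidden behind the service.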

### Flask

- install it
- create a Flask object
- write the handlers
- run it (and keep it running)

Create a Flask object:



```
from flask import Flask, request, jsonify

app = Flask(__name__)  # __name__ is the name of the current module
```


Write the handlers:



```
@app.route("/", methods=["GET"])
def general():
    return "Welcome to prediction process"
```


Run it (and keep it running):


```
if __name__ == "__main__":
    app.run()
```
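Before starting a real server, the handler can be checked in-process. The sketch below assumes Flask is installed and uses Flask's built-in `test_client()`; it repeats the app and handler from the snippets above so that it runs on its own.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET"])
def general():
    return "Welcome to prediction process"


# The test client sends requests to the app without opening a network port.
client = app.test_client()
response = client.get("/")

status = response.status_code
body = response.get_data(as_text=True)
print(status, body)
```

This kind of check is convenient in a notebook, because `app.run()` blocks the cell until the server is stopped.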




# Step 1 - TRAIN

### Training the pipeline

1. Load the data: https://www.kaggle.com/shivamb/real-or-fake-fake-jobposting-prediction
2. Build a pipeline with the simplest preprocessing (tf-idf) of the text data
3. Train a logistic regression and save the fitted pipeline to disk

In [None]:
import pandas as pd
import dill
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn.metrics import roc_auc_score, roc_curve, precision_recall_curve
from sklearn.metrics import f1_score

#working with text
from sklearn.feature_extraction.text import TfidfVectorizer

#normalizing data
from sklearn.preprocessing import StandardScaler

#pipeline
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.metrics import precision_score,recall_score

#imputer
from sklearn.impute import SimpleImputer

import sklearn.datasets

Load the data

Google Drive link: https://drive.google.com/file/d/1T0V44fGNCbgaWh78gZ2ieeoqo5eBowVi

In [None]:
!wget 'https://drive.google.com/uc?export=download&id=1T0V44fGNCbgaWh78gZ2ieeoqo5eBowVi' -O fake_job_postings.csv


In [None]:
df = pd.read_csv("./fake_job_postings.csv")
df.head(3)

Unnamed: 0,job_id,title,location,department,salary_range,company_profile,description,requirements,benefits,telecommuting,has_company_logo,has_questions,employment_type,required_experience,required_education,industry,function,fraudulent
0,1,Marketing Intern,"US, NY, New York",Marketing,,"We're Food52, and we've created a groundbreaki...","Food52, a fast-growing, James Beard Award-winn...",Experience with content management systems a m...,,0,1,0,Other,Internship,,,Marketing,0
1,2,Customer Service - Cloud Video Production,"NZ, , Auckland",Success,,"90 Seconds, the worlds Cloud Video Production ...",Organised - Focused - Vibrant - Awesome!Do you...,What we expect from you:Your key responsibilit...,What you will get from usThrough being part of...,0,1,0,Full-time,Not Applicable,,Marketing and Advertising,Customer Service,0
2,3,Commissioning Machinery Assistant (CMA),"US, IA, Wever",,,Valor Services provides Workforce Solutions th...,"Our client, located in Houston, is actively se...",Implement pre-commissioning and commissioning ...,,0,1,0,,,,,,0


In [None]:
df['fraudulent'].value_counts()

0    17014
1      866
Name: fraudulent, dtype: int64
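The class counts above show a strong imbalance, which is worth quantifying before choosing a metric. A small worked calculation, using only the two counts from the output above:

```python
# Class counts copied from the value_counts() output above.
n_real, n_fake = 17014, 866
total = n_real + n_fake

# Share of fraudulent postings in the whole dataset.
fake_share = n_fake / total

# Accuracy of a trivial model that labels every posting as real (class 0).
baseline_accuracy = n_real / total

print(f"fraudulent share: {fake_share:.3f}")
print(f"always-0 accuracy: {baseline_accuracy:.3f}")
```

Because a model that never predicts fraud already reaches about 95% accuracy, precision, recall and F-score are more informative here than accuracy.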

Split the data into train/test and save both samples to disk

In [None]:
X_train, X_test, y_train, y_test = train_test_split(df, df['fraudulent'],
                                                    test_size=0.33, random_state=42)
# save test
X_test.to_csv("X_test.csv", index=None)
y_test.to_csv("y_test.csv", index=None)

# save train
X_train.to_csv("X_train.csv", index=None)
y_train.to_csv("y_train.csv", index=None)

In [None]:
class ColumnSelector(BaseEstimator, TransformerMixin):
    """
    Transformer to select a single column from the data frame to perform additional transformations on
    """
    def __init__(self, key):
        self.key = key

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[self.key]
    

class TextImputer(BaseEstimator, TransformerMixin):
    def __init__(self, key, value):
        self.key = key
        self.value = value
    
    def fit(self, X, y=None):
        return self
    
    def transform(self, X):
        X[self.key] = X[self.key].fillna(self.value)
        return X
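Note that `TextImputer.transform` as written assigns into the DataFrame it receives, so the caller's data changes as a side effect. A minimal sketch of a copying variant is shown below; the class name `CopyingTextImputer` and the toy DataFrame are hypothetical and introduced only for this illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class CopyingTextImputer(BaseEstimator, TransformerMixin):
    """Hypothetical variant of TextImputer that leaves its input untouched."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = X.copy()  # work on a copy so the caller's frame is not modified
        X[self.key] = X[self.key].fillna(self.value)
        return X


# Toy data: one posting has no "benefits" text.
toy = pd.DataFrame({"benefits": ["free lunch", None]})
filled = CopyingTextImputer("benefits", "").fit_transform(toy)

print(filled["benefits"].tolist())
print(toy["benefits"].isna().sum())
```

The copy costs some memory on large frames, so whether it is worth it depends on how the pipeline's input is reused afterwards.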

In [None]:
features = ['description', 'company_profile', 'benefits']
target = 'fraudulent'

Let's assemble the part responsible for feature engineering

In [None]:
# combine
description = Pipeline([
                ('imputer', TextImputer('description', '')),
                ('selector', ColumnSelector(key='description')),
                ('tfidf', TfidfVectorizer())
            ])

company_profile = Pipeline([
                ('imputer', TextImputer('company_profile', '')),
                ('selector', ColumnSelector(key='company_profile')),
                ('tfidf', TfidfVectorizer())
            ])

benefits = Pipeline([
                ('imputer', TextImputer('benefits', '')),
                ('selector', ColumnSelector(key='benefits')),
                ('tfidf', TfidfVectorizer())
            ])


feats = FeatureUnion([('description', description),
                      ('company_profile', company_profile),
                      ('benefits', benefits)])
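To see what `FeatureUnion` produces, here is a toy sketch on two made-up postings. It assumes scikit-learn and pandas are installed, and it swaps the lesson's `ColumnSelector` for scikit-learn's `FunctionTransformer` purely to keep the example short; the helper `text_branch` is hypothetical.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Two made-up postings with two text columns.
toy = pd.DataFrame({
    "description": ["python developer", "sales manager"],
    "benefits": ["free lunch", "health insurance"],
})


def text_branch(column):
    """Select one text column and vectorize it with tf-idf."""
    return Pipeline([
        ("selector", FunctionTransformer(lambda frame: frame[column])),
        ("tfidf", TfidfVectorizer()),
    ])


union = FeatureUnion([
    ("description", text_branch("description")),
    ("benefits", text_branch("benefits")),
])

# Each branch yields its own tf-idf matrix; the union stacks them side by side.
matrix = union.fit_transform(toy)
print(matrix.shape)
```

Each column gets its own vocabulary (four words per column in this toy data), and the resulting matrix has one row per posting and one column per word across all branches.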

Add a simple classifier

In [None]:
%%time

pipeline = Pipeline([
    ('features', feats),
    ('classifier', LogisticRegression()),
])

pipeline.fit(X_train, y_train)

CPU times: user 5.55 s, sys: 2.39 s, total: 7.93 s
Wall time: 5.89 s



Let's see what our pipeline looks like

In [None]:
pipeline.steps

[('features', FeatureUnion(transformer_list=[('description',
                                  Pipeline(steps=[('imputer',
                                                   TextImputer(key='description',
                                                               value='')),
                                                  ('selector',
                                                   ColumnSelector(key='description')),
                                                  ('tfidf', TfidfVectorizer())])),
                                 ('company_profile',
                                  Pipeline(steps=[('imputer',
                                                   TextImputer(key='company_profile',
                                                               value='')),
                                                  ('selector',
                                                   ColumnSelector(key='company_profile')),
                                                  ('tfidf

Save the model (the pipeline)

In [None]:
with open("logreg_pipeline.dill", "wb") as f:
    dill.dump(pipeline, f)

# Step 2 - PREDICT

### Checking that the pipeline works and how good it is

We are not starting any API yet: we load the model (pipeline) directly and check it on the held-out (test) sample

In [None]:
X_test = pd.read_csv("X_test.csv")
y_test = pd.read_csv("y_test.csv")

In [None]:
X_test.head(3)

Unnamed: 0,job_id,title,location,department,salary_range,company_profile,description,requirements,benefits,telecommuting,has_company_logo,has_questions,employment_type,required_experience,required_education,industry,function,fraudulent
0,4709,Python Engineer,"GB, , London",,,,Stylect is a dynamic startup that helps helps ...,We don’t care where you studied or what your G...,We are negotiable on salary and there is the p...,0,1,0,Full-time,Entry level,Unspecified,Apparel & Fashion,Information Technology,0
1,11080,Entry Level Sales,"US, OH, Cincinnati",,55000-75000,,General Summary: Achieves maximum sales profit...,,Great Health and DentalFast Advancement Opport...,1,0,0,Full-time,Entry level,High School or equivalent,Financial Services,Sales,0
2,12358,Agile Project Manager,"US, NY, New York",,,ustwo offers you the opportunity to be yoursel...,"At ustwo™ you get to be yourself, whilst deliv...",Skills• Experience interfacing directly with c...,,0,1,0,,,,,,0


In [None]:
with open('logreg_pipeline.dill', 'rb') as in_strm:
    pipeline = dill.load(in_strm)

In [None]:
pipeline

Pipeline(steps=[('features',
                 FeatureUnion(transformer_list=[('description',
                                                 Pipeline(steps=[('imputer',
                                                                  TextImputer(key='description',
                                                                              value='')),
                                                                 ('selector',
                                                                  ColumnSelector(key='description')),
                                                                 ('tfidf',
                                                                  TfidfVectorizer())])),
                                                ('company_profile',
                                                 Pipeline(steps=[('imputer',
                                                                  TextImputer(key='company_profile',
                                                     

In [None]:
preds = pipeline.predict_proba(X_test)[:, 1]

pred_df = pd.DataFrame({'preds': preds})
pred_df.to_csv("test_predictions.csv", index=None)

In [None]:
preds[:10]

array([0.02602203, 0.04317015, 0.00370601, 0.00112958, 0.00151454,
       0.00213981, 0.00256837, 0.00373913, 0.00069803, 0.0122069 ])

In [None]:
precision, recall, thresholds = precision_recall_curve(y_test, preds)

fscore = (2 * precision * recall) / (precision + recall)
# locate the index of the largest f score
ix = np.argmax(fscore)
print(f'Best Threshold={thresholds[ix]}, F-Score={fscore[ix]:.3f}, Precision={precision[ix]:.3f}, Recall={recall[ix]:.3f}')

Best Threshold=0.20397799406875214, F-Score=0.852, Precision=0.900, Recall=0.809
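As a sanity check, the reported F-score can be recomputed from the reported precision and recall, and the chosen threshold can be applied to the first ten probabilities printed earlier. All numbers below are copied from the outputs above; nothing is recomputed from the data.

```python
# Values copied from the outputs above.
precision, recall, threshold = 0.900, 0.809, 0.20397799406875214
probabilities = [0.02602203, 0.04317015, 0.00370601, 0.00112958, 0.00151454,
                 0.00213981, 0.00256837, 0.00373913, 0.00069803, 0.0122069]

# F-score is the harmonic mean of precision and recall.
f_score = 2 * precision * recall / (precision + recall)

# A posting is flagged as fraudulent when its probability reaches the threshold.
labels = [int(p >= threshold) for p in probabilities]

print(f"{f_score:.3f}")
print(labels)
```

None of the first ten test postings reaches the threshold, so all of them are labeled as real.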


# Step 3 - FLASK

## Deployment

**At deployment time you need to:**
*   Define the JSON format in which the service accepts data and sends results back.
*   Define the IP address and port that will receive the data.
*   Create the required routes in Flask:<br/>
    `@app.route('/predict_example', methods=['POST'])`<br/>
    `def predict_example():`
*   Move all data transformation functions into the Flask service,
    *   the format of data coming from the front-end system may differ from the format of the historical data used to build the model; after the transformations, the data must reach the model in exactly the form the model was trained on.
*   Load the trained models.
*   Set up logging.
## Flask

The Flask service that handles requests will live here.

Google Colab provides a virtual machine, so we cannot reach its localhost the way we do on our own computer when running a local web server. What we can do is expose it through a public URL using ngrok.

https://medium.com/@kshitijvijay271199/flask-on-google-colab-f6525986797b

In [None]:
!pip install flask-ngrok



In [None]:
from flask_ngrok import run_with_ngrok
from flask import Flask, request, jsonify
import pandas as pd

https://dashboard.ngrok.com/get-started/setup

In [None]:
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.tgz
!tar -xvf /content/ngrok-stable-linux-amd64.tgz
!./ngrok authtoken <YOUR_NGROK_AUTHTOKEN>
!./ngrok http 80



In [None]:
# Trial run of Flask

app = Flask(__name__)
run_with_ngrok(app)  # Start ngrok when app is run

@app.route("/a")
def hello():
    return "Hello World!"

if __name__ == '__main__':
    app.run()

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://d65a-35-202-7-199.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [10/Jun/2022 17:24:36] "GET / HTTP/1.1" 404 -
127.0.0.1 - - [10/Jun/2022 17:24:36] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [10/Jun/2022 17:24:59] "GET / HTTP/1.1" 404 -
127.0.0.1 - - [10/Jun/2022 17:25:06] "GET /a HTTP/1.1" 200 -


In [None]:
import pandas as pd
import dill

### **Creating a service that handles requests to the model**

In [None]:
# Load the trained model
with open('logreg_pipeline.dill', 'rb') as in_strm:
    model = dill.load(in_strm)

In [None]:
X_test = pd.read_csv("X_test.csv")
y_test = pd.read_csv("y_test.csv")

Start the service and keep it running while we work

In [None]:
# Handlers and Flask startup
app = Flask(__name__)
run_with_ngrok(app)  # Start ngrok when app is run


@app.route("/", methods=["GET"])
def general():
    return "Welcome to prediction process"

@app.route('/predict', methods=['POST'])
def predict():
    data = {"success": False}

    # read the three text fields; a missing or empty field falls back to ""
    request_json = request.get_json()
    description = request_json.get("description") or ""
    company_profile = request_json.get("company_profile") or ""
    benefits = request_json.get("benefits") or ""

    print(description)
    # the pipeline expects a DataFrame with the text columns it was trained on
    preds = model.predict_proba(pd.DataFrame({"description": [description],
                                              "company_profile": [company_profile],
                                              "benefits": [benefits]}))
    # convert the NumPy scalar to a plain float before serializing to JSON
    data["predictions"] = float(preds[:, 1][0])
    data["description"] = description
    # indicate that the request was a success
    data["success"] = True
    print('OK')

    # return the data dictionary as a JSON response
    return jsonify(data)


if __name__ == '__main__':
    app.run()

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://afcc-35-202-7-199.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
127.0.0.1 - - [10/Jun/2022 17:36:32] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:36:33] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [10/Jun/2022 17:39:00] "POST /predict HTTP/1.1" 200 -


Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them. As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity.  Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background.
OK
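The POST requests in this log come from a client that calls the `/predict` route. A minimal client sketch using only the standard library is shown below; the helper names `build_predict_request` and `send` are hypothetical, and the base URL is a placeholder that has to be replaced with the address printed by your own Flask run.

```python
import json
import urllib.request


def build_predict_request(base_url, description, company_profile="", benefits=""):
    """Build (but do not send) a POST request for the /predict route."""
    body = json.dumps({
        "description": description,
        "company_profile": company_profile,
        "benefits": benefits,
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(req, timeout=10):
    """Send the request and decode the JSON response (needs a running service)."""
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder address: replace it with the URL printed by your own Flask run.
req = build_predict_request("http://127.0.0.1:5000",
                            "Looking for an experienced Java Architect")
print(req.get_method(), req.full_url)
```

Calling `send(req)` while the service is running should return the dictionary built in `predict()`, with the `success`, `predictions` and `description` keys.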


127.0.0.1 - - [10/Jun/2022 17:40:40] "[37mPOST /predict HTTP/1.1[0m" 200 -


Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them. As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity.  Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background. 
OK


127.0.0.1 - - [10/Jun/2022 17:40:53] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -


Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them. As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity.  Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background. 
OK
General Summary: Achieves maximum sales profitability, growth and account

127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -


About EDITDEDITD runs the world's biggest apparel data warehouse, which global and local retailers use to track the market, align product assortment and trade with competitive intelligence. EDITD’s software is the market leader in real-time analytics of pricing, assortment, and deep product metrics for apparel professionals in merchandising, buying, trading and strategy. Used by the world’s best fashion retailers, like Gap, ASOS and Target, across five continents, EDITD helps buyers and merchandisers to make the right trading decisions.The JobIn a direct marketing role, you'll be promoting EDITD product to both existing and potential customers. Your job is to help increase sales by raising the profile of our business, through targeted promotional marketing campaigns and strategies. Your focus will be to deliver the best customer experience through all channels including the website, email and social media. You will work directly with our Marketing Director as well as other team members

127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -


Customer Care Agent (Night Shift with Spanish)Are you looking for an opportunity to join an exciting company and be part of something really special?  Well how about this… ding* (known as ezetop in our past life) is looking for a vibrant and energetic Customer Care Agent to join our fast growing Online Operations team! Our Customer Operations Supervisors are searching for someone who is quick thinking, patient and passionate about providing a professional, world class Customer Care experience for our customers around the globe.Comprehensive on-going training will be provided but a positive, proactive attitude is the key to being successful in this role! You’ll also be contributing ideas and identifying key trends in queries from our customers and relaying this to the business.Here’s what you’ll do day to day:Manage and resolve customer and client queries raised by phone, email and live chatsIdentify emerging trends and issues and escalate these to your Team LeadEnsure our customers tak

127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -


The JobBuild relationships with the mentor, investor, media, and partner ecosystem – communicate and manage for a long-term relationship with SeedcampWork closely with entrepreneurs applying to and participating in Seedcamp programmeOrganizing and involving mentors for Seedcamp Week, Academy, US tripSpeak about Seedcamp at public events - both Seedcamp and non-Seedcamp eventsUse startup products, stay up to date and knowledgeable about what new products and tech are emergingSupport the Seedcamp selection and investment processOften the first point of contact for our portfolioAbout YouLive and breathe technology and startups and are passionate about empowering founders and growing the startup communityCombination of gravitas (with the network) and being a peer with foundersGregarious, engaging, smart, high integrityHave experience working in startup environment, you have solid knowledge about marketing, growth and product as relevant for startups.Experience with managing content (events

127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:54] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -


Job Summary:Positions, reworks, fits and welds according to written and typed instructions, blueprints, manufacturer's specifications and knowledge of characteristics of metal. Selects equipment and plans layout, fabrication, assembly and welding. Knowledge of Sub Arc Welding is a plus
OK
You Don't Know It AllYet.But you're not afraid to learn. In fact, you love learning. Can't get enough of it. And you're not completely green, you've been hanging around the marketing space for awhile and you know you're going to crush it.We're hiring someone amazing that knows a little about online marketing but wants to know it all. This is a paid internship that will lead to a permanent position.Please see site for full details BEFORE applying:#URL_b5fc7eb51433d61c8880e5fda02e183861992844172b638478ad87de73658003#
OK
The Videographer / Editor / Photographer will work closely with the creative team to shoot, edit, and produce short videos for the editorial and ecommerce partner sites. You will be invo

127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -


Mass-market adaption of mobile and social is driving change at an unprecedented rate in the online travel sector. People are searching out new online experiences from research, booking, and right through to online feedback. This will create new commercial opportunities which we want to be prepared for.We’re looking for somebody who’s passionate about the underlying trends driving this change and identifying opportunities we should be working on.
OK
Looking for a change?  Not happy where you are?  Then give us a call!!Network Closing Services, Inc., a full service Title Company is seeking Title/Escrow Closers with a book of business.  We are growing, come join a winning team!Network Closing Services has been serving Lenders, Real Estate Consumers, and Professionals since 1999.  We provide courteous professional services, speedy title searches, and timely disbursements.  Dynamic flexibility is key to our success.  Our Client satisfaction is very important.  We provide experienced settlem

127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -


1.    Technical Lead - Rhomobile Technical Mobility Lead Developer  NOTES-  In case if Rhomobile is difficult to find, Please find people with expertise in Android development with, DOJO, CSS Location: San Francisco/ San Ramon, CADuration: 1 year Required Competencies: Should have around 6-10 yrs of IT Experience- out of which minimum 2yrs as Tech Lead of Mobile ProjectsExperience in RhoMobile Platform for atleast 6 MonthsShould be experienced in Cross Platform Mobile apps.Proficiency in atleast one Native Platform – iOS or AndroidHands-on Experience essentialShould be able to lead a team and guide them in Technical challengesShould be able to estimate and plan for the various activitiesIntegration Experience with Backend Systems Essential. Out of this , SAP Integration Experience is desirableESRI ArcGIS Knowledge Essential  Thanks and Regards J.SandeepTechnical RecruiterTekWissen LLC | W: #PHONE_b464fe6050e48f0c36d00501265378e9581d5d65c73f8e39865543c69aaab557# | Desk: #PHONE_46ed5da44

127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "[37mPOST /predict HTTP/1.1[0m" 200 -


Looking for an experienced Java Architect for an immediate opening.
OK
The Customer Service Associate will be based in Richardson, TX. The right candidate will be an integral part of our talented team, supporting our continued growth.Responsibilities:Perform various Mail Center activities (sorting, metering, folding, inserting, delivery, pickup, etc.)Lift heavy boxes, files or paper when neededMaintain the highest levels of customer care while demonstrating a friendly and cooperative attitudeDemonstrate flexibility in satisfying customer demands in a high volume, production environmentConsistently adhere to business procedure guidelinesAdhere to all safety proceduresTake direction from supervisor or site managerMaintain all logs and reporting documentation; attention to detailParticipate in cross-training and perform other duties as assigned (Filing, outgoing shipments, etc)Operating mailing, copy or scanning equipmentShipping &amp; ReceivingHandle time-sensitive material like confiden

127.0.0.1 - - [10/Jun/2022 17:40:55] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:55] "POST /predict HTTP/1.1" 200 -


Our Company, Replise, a growing and exciting social media analytics company has an immediate need for a Senior Front End Developer for a permanent position.In this role, you will collaborate with the dev team and cross-functionally (Designers, UX, and PMs) to create exciting and interactive experiences. This is a fast-paced environment that is always changing, yet stable and creative.Responsibilities:Work with the Front End and internal business teams to develop client softwareIdentify requirements and suggest solutions necessary to meet those requirementsLead development to a completed solutionServe as a resource for scoping and scheduling of projectsWrite standards-compliant Front End code using Javascript, CSS, and HTMLTranslate visual designs, user experience flows, and content into functional and engaging interfacesChoose proper technologies based on requirements and design
OK
Fleksy, is the next generation smart keyboard that lets you type on a touch-screen, without even looking 

127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -


Let's post here candidates we like but don't have an open position for.
OK
We are currently looking for an Administrative Assistant to work on-site in our Bellingham office.  Successful candidates will be highly capable in a vast array of administrative skills and be highly flexible to meet the changing needs of the company.  This position will work directly with the Corporate and Accounting team and perform many duties requiring proficiency with computer usage, excellent communication skills, and top-notch organization skills.  A successful candidate must be able to juggle the needs of multiple managers while organizing their schedule to ensure priority goes to the most critical duties. Key Responsibilities:Provide administrative support to executives and finance department managerBook travel arrangementsHelp plan and execute company eventsData entry, including entering bills and charges into financial software. Assist the finance department with Account Receivable analysis and commun

127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -


Experienced Restaurant Accountant and Operations Manager NeededIf you can bring energy, enthusiasm and enjoy directly advising business owners; we would like to meet you. O|Miga is a fast growing, St Louis based accounting and back office services firm with national and local clients. We provide accounting, payroll, HR and other support services to our fast growing and high performing organizations.  One of our key and expanding customer segments is the restaurant industry, and we are looking for candidates with experience in restaurant operations, finance and accounting.   Our unique delivery model enables our team to help our clients to improve and grow their operations.  Our clients are able to grow faster because we deliver more than just the numbers. We get the work done that prevents them from focusing on their core business.Our Associates have a broad base of business knowledge with specific skills in accounting, finance  and human resources. They must combine these with excepti

127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -


SimilarWeb is on search of a driven Head of Performance Marketing, to develop and evolve acquisition strategy. You will work across PPC, Media Buying and Affiliate Marketing in order to generate traffic.As Head of Performance Marketing you will oversee the demand generation strategy, execution and budgets and be accountable for the traffic generation results.Key ResponsibilitiesProducing traffic generation goals and  forecastsManaging large marketing budgetsMonitoring and analyzing performance against traffic, CPA, and ROI goalsContributing to integrated campaign strategy and plans, outlining key marketing themes, messages and offers throughout the yearOverseeing the campaign planning and strategy
OK
Successful and profitable start-up looking for an Ad Monetisation expertMAG Interactive has rapidly become one of the fastest growing mobile gaming companies in Sweden and is well recognised worldwide.More than 50 million fans enjoy playing Ruzzle and QuizCross and have played more than 10

127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:56] "POST /predict HTTP/1.1" 200 -


The Client Account Manager will be located in Windsor CT.  The right candidate will be an integral part of our talented team, supporting our continued growth. The Client Account Manager is responsible for ensuring Service Level Agreements (SLAs) are met on a consistent basis  (Mail, Document Imaging/Scanning and Fulfillment production workflow) communicating workflow status to operations management and working with customer to fulfill contractual obligations, identify business opportunity and grow service offering.  In order to achieve these objectives, the Client Account Manager will be expected to maintain a high level of client contact and customer satisfaction (SLA attainment) and develop site personnel to meet or exceed customer and Novitex objectives. Candidates must have High School Diploma, proven leadership experience, minimum of 2 years supervisory experience, minimum of 3 years customer service experience and a minimum of 1 year experience in a Document Imaging and Indexing 

127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -


This role is primarily to farm and grow our Houston, TX based clients and prospects Build Visual BI's BI Center of Excellence and BI Practice Competencies Become Integral Part of Visual BI's Vision to be the Best BI Consulting and Solutions Firm Execute BI Strategy by leveraging SAP BW and HANA capabilities as Enterprise Data Warehouse(EDW). Provide solutions architecture oversight for new development projects in support of our client's BI programBuild Project Plan timelines and Ensure BI Project Executions to those timelines and budget. Ensure adoption of best-in-class practices and standards for development, support, quality control and documentationWork with stakeholders to analyze business requirements, and define target SAP BI/BW solution architecture and associated technical specifications &amp; implementation planLead large cross functional teams including client staff and implementation team to accomplish successful completion of one or more solution requirements, architecture,

127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -


To apply please visit our website at #URL_06ae9636e61d7ddfc75b7dec9887f7022036b464a1ef22d098f1e03084cd3614# and click on our Careers page.Tidewater Finance Company is seeking full-time RECOVERY SPECIALISTS. Join a growing team of high performance professionals in a team-oriented environment! The qualified applicant must be able to:Properly and independently work assigned accounts to locate customer and/or collateral by performing advanced loss prevention activities Perform basic and advanced skip-tracing with the use of internal and external skip-tracing resourcesNegotiate account resolution and accurately input and document all actions within the collections systemMonitor and measure performance of third party repo agents and other outside vendors to ensure goals are achieved in the most cost effective mannerEnsure all company policies and procedures are adhered toAlert management of potential risk exposure The qualifications for this position include:A professional demeanorAdaptabili

127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -


Please apply for the position as a .Net Developer at In2media by clicking the "Apply for this job"-button below.We are looking forward to receiving your application.In2media
OK
Kappa Search Inc. is a Chicago technical staffing company that specializes in engineering, manufacturing, technical sales and supply chain recruitment and placement. We are currently recruiting for a mechanical design engineer with product development experience. Responsibilities and requirements as follows.Responsibilities:Responsible for producing 3D CAD and other technical drawings for all new product designs and existing products.To provide the required design services (such as CAD &amp; CAE) for NPD and support of existing products and their application, as well as for the modification and further development of these same products.Prepares design concepts, layouts, assembly drawings and schematics as part of the development team.Material and component selection; develops specs for product.Pepares cost esti

127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [10/Jun/2022 17:40:57] "POST /predict HTTP/1.1" 200 -


At Intent HQ we’re tackling some seriously difficult problems, right at the cutting edge of deep consumer analysis. We model user interests and apply this insight to solve challenging consumer problems at scale. Want to draw insights from 20 million detailed social network profiles? In realtime? We do.To help us innovate faster, we’re building a new R&amp;D group. This team is responsible for researching, designing and prototyping algorithms in the machine learning and NLP space. We have an engineering team responsible for the overall platform, who you will work closely with to bring prototypes to production.
OK
Our client is searching for a strong candidate that has 2 years Grails development experience as well as 3 years Java development experience.This is an excellent opportunity for an ambitious, creative developer that wants to innovate and build what doesn’t exist yet. Working remotely for a well funded company with an existing distributed team. Our client is a cloud platform pro

Client side: https://colab.research.google.com/drive/1UK_ToiHKZaZhKt8nZmAlWDIb0Nhz_c5X

Test client

In [None]:
# Sample data
description_data, company_profile_data, benefits_data = ( 
    "Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them.\xa0As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity. \xa0Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background.",
    "ustwo offers you the opportunity to be yourself, whilst delivering the best work on the planet for some of the biggest and most innovative brands. A culture thriving on collaboration underpins what is an amazing work smart/ live well environment.We genuinely care about the work that we deliver and the people who help make it all possible. We only invest in projects, people and practices that we believe in, to ensure we remain excited about every opportunity.",
    "We are negotiable on salary and there is the potential for equity for the right candidate."
)

body = {
    'description': description_data,
    'company_profile': company_profile_data,
    'benefits': benefits_data,
}

In [None]:
with app.test_client() as t:
    response = t.post('/predict', json=body)
    json_data = response.get_json()

json_data

Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them. As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity.  Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background.
OK


{'description': 'Stylect is a dynamic startup that helps helps women discover and buy shoes. We’re a small team based in London that has previously worked at Google, Techstars, Pixelmator and Rocket Internet.We place a high premium on simplicity no matter what we’re working on (i.e. design, programming, marketing). We’re also a team that ships fast. We built version 1 of our app in a week, the next release (built in a month) was featured in the Apple Appstore Italy as a best new fashion app. Fast release cycles are challenging, but also very fun - which is why we love them.\xa0As we’ve grown, the projects that we’re working on have grown both in scale and in technical complexity. \xa0Stylect is looking for someone who can help us improve our backend which gathers product data; analyses/categorizes it; and shows it to thousands of users daily. Each step in the process has unique challenges that demands a strong technical background.',
 'predictions': 0.001129359819418078,
 'success': Tr
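
A client usually has to turn the raw `predictions` score into a decision. The sketch below shows one way to do that, under several assumptions: the response is modelled as a hypothetical JSON string containing only the `predictions` and `success` fields, `success` is assumed to be a boolean flag, the score is assumed to be the probability of the positive class, and the `0.5` threshold and the helper name `label_from_response` are illustrative choices that do not come from the original notebook.

```python
import json

# Hypothetical response body; only the score value is taken from the output above.
raw_response = '{"predictions": 0.001129359819418078, "success": true}'


def label_from_response(response, threshold=0.5):
    """Return (score, is_positive) for a parsed /predict response.

    Assumes `success` is a boolean flag and `predictions` is a single
    probability of the positive class. The threshold is an assumption.
    """
    if not response.get("success"):
        raise ValueError("the server reported a failed prediction")
    score = float(response["predictions"])
    return score, score >= threshold


score, is_positive = label_from_response(json.loads(raw_response))
print(f"score={score:.6f}, positive={is_positive}")
```

In practice the threshold would be chosen on a validation set rather than fixed at `0.5`.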

# Homework

**Standard version**

Implement a REST API based on Flask in Google Colab.

1. Choose a dataset (whichever you find most interesting; you can browse the options here: https://economic-caper-a4c.notion.site/d062c410b90145bca90fc23b1348c813), build a pipeline (transformations + model), and save it to disk. You can skip the pipeline if you prefer, but having one makes the model easier to call from the service code.
2. Implement a notebook with the server.
3. Implement a notebook with the client.
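
As a starting point for the client notebook, here is a minimal sketch of a client that posts a JSON body to the `/predict` endpoint. It makes several assumptions: it uses the standard-library `urllib` rather than a third-party HTTP library, the base URL `http://127.0.0.1:5000` is only a placeholder for wherever your server runs, and the helper names `build_predict_request` and `send_request` are invented for illustration.

```python
import json
from urllib import request


def build_predict_request(base_url, body):
    """Build (but do not send) a POST request with a JSON body for /predict."""
    data = json.dumps(body).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(req, timeout=10):
    """Send the request and decode the JSON response (needs a running server)."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder payload with the same keys as the test client above.
example_body = {"description": "Sample text", "company_profile": "", "benefits": ""}
req = build_predict_request("http://127.0.0.1:5000", example_body)
print(req.get_method(), req.full_url)
# send_request(req) would perform the call once the server is running.
```

Keeping request construction separate from sending makes it possible to inspect the request without a live server.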


**Advanced version**

Implement a REST API based on Flask (example: https://github.com/fimochka-sudo/GB_docker_flask_example).

1. Choose a dataset (whichever you find most interesting; you can browse the options here: https://economic-caper-a4c.notion.site/d062c410b90145bca90fc23b1348c813), build a pipeline (transformations + model), and save it to disk. You can skip the pipeline if you prefer, but having one makes the model easier to call from the service code.
2. Your project will need a requirements.txt listing its packages. You can use the file from the example project above as a starting point. To install the packages directly from PyCharm, open the terminal and run pip install -r requirements.txt from the project root.
3. The final project must contain:
    1. the app/models/ directory (the pretrained model pipeline, or the code that trains it)
    2. the app/run_server.py file (the main Flask application code)
    3. requirements.txt (the list of packages used in the project, in the project root)
    4. README.md (a description of what you are doing, what the data is, how to run it, and so on)
    5. *Dockerfile
    6. *docker-entrypoint.sh
4. *A front-end service that accepts data entered by the user and calls your API. This is mostly for your own benefit: if the project keeps developing (new models, interesting approaches), it makes a good resume item and a line in your portfolio.
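
Before submitting, it can help to check the project layout automatically. The sketch below is one possible check, not part of the assignment. The helper name `missing_items` is an assumption, and the two starred entries (`Dockerfile`, `docker-entrypoint.sh`) are treated as optional. The demo builds a partial layout in a temporary directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path

# Paths relative to the project root, taken from the list above.
REQUIRED = ["app/models", "app/run_server.py", "requirements.txt", "README.md"]
OPTIONAL = ["Dockerfile", "docker-entrypoint.sh"]  # starred items, assumed optional


def missing_items(root, expected=REQUIRED):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


# Demo on a throwaway directory: create everything except README.md.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    missing_before = missing_items(project)
    (project / "app" / "models").mkdir(parents=True)
    (project / "app" / "run_server.py").touch()
    (project / "requirements.txt").touch()
    missing_after = missing_items(project)

print("missing before:", missing_before)
print("missing after:", missing_after)
```

Running `missing_items(".")` from the project root lists anything still to be added; passing `expected=OPTIONAL` checks the starred files.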

## Useful links:

1. Datasets (for inspiration): https://www.kaggle.com/datasets
2. The Sberbank housing market competition (you can take this dataset and train a model to predict housing prices, which could make a decent service): https://www.kaggle.com/c/sberbank-russian-housing-market/data Notebooks with various approaches are available there as well.
3. A minimalist example of combining keras and flask to classify an image: https://blog.keras.io/building-a-simple-keras-deep-learning-rest-api.html
4. A good example of combining docker and flask: https://cloud.croc.ru/blog/byt-v-teme/flask-prilozheniya-v-docker/
5. https://www.digitalocean.com/community/tutorials/how-to-build-and-deploy-a-flask-application-using-docker-on-ubuntu-18-04
6. https://flask.palletsprojects.com/en/2.1.x/


In [4]:
import numpy as np

In [11]:
a, b = [1] * 2

In [12]:
type(b)

int