(Yandex Cloud Object Storage example; make sure that you have configured the AWS CLI via aws configure with your key ID and secret key before using this command.)
Edit the my.env file in the project folder with your specific parameters:
PUBLIC_SERVER_IP=<your_public_ip> #insert yours
MLFLOW_S3_ENDPOINT_URL=<endpoint_url> #like https://storage.yandexcloud.net
AWS_DEFAULT_REGION=<default_region> #like ru-central1
AWS_ACCESS_KEY_ID=<your_key_id> #insert yours
AWS_SECRET_ACCESS_KEY=<your_secret_key> #insert yours
BACKEND_URI=sqlite:////mlflow/database/mlops-project.db #leave it as it is
ARTIFACT_ROOT=s3://<your_bucket_name>/mlflow-artifacts/ #insert your bucket name
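For instance, a filled-in my.env could look like the fragment below; every value here is made up purely for illustration, so substitute your own endpoint, region, keys and bucket name:

```shell
PUBLIC_SERVER_IP=51.250.1.20
MLFLOW_S3_ENDPOINT_URL=https://storage.yandexcloud.net
AWS_DEFAULT_REGION=ru-central1
AWS_ACCESS_KEY_ID=YCAJEexampleKeyId
AWS_SECRET_ACCESS_KEY=YCPexampleSecretKey
BACKEND_URI=sqlite:////mlflow/database/mlops-project.db
ARTIFACT_ROOT=s3://my-mlops-bucket/mlflow-artifacts/
```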
!!! MLFLOW_S3_ENDPOINT_URL is needed for S3-compatible alternatives to AWS S3, so if you are using AWS itself you can probably delete this row (I am not sure of this, because I had no chance to test the deployment on AWS). In that case, go to the project folder and edit docker-compose.yml: comment out the rows with MLFLOW_S3_ENDPOINT_URL in the environment blocks of all services. With AWS itself, each of those rows in docker-compose.yml should look like this: # MLFLOW_S3_ENDPOINT_URL: "${MLFLOW_S3_ENDPOINT_URL}"
Finally, from the project folder under your default base environment, run
bash run-venv.sh
It will create virtual environments for the project services.
You will find yourself in the orchestration_manager venv, which will be used in the following steps; it can also be activated from a new terminal by running pipenv shell from the **orchestration_manager** folder.
Open a new terminal and, from the project folder under the base env, run
bash run-tests.sh
It will launch the unit and integration tests.
You can use the terminal from stage 3. Check that the ports needed by docker-compose are free with
docker ps
and, if needed, free them with
docker kill <container_id>
The services use:
5001 for MLFlow
4200 for Prefect Orion
3000 for Grafana
9696 for the prediction service
9898 for the manager service
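If you prefer not to eyeball the docker ps output, a quick standalone way to check those ports (my own sketch, not part of the project's scripts) is:

```python
import socket

# Ports used by the project's services (from the list above)
PROJECT_PORTS = {
    5001: "MLFlow",
    4200: "Prefect Orion",
    3000: "Grafana",
    9696: "prediction service",
    9898: "manager service",
}

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only if something accepted the connection
        return sock.connect_ex((host, port)) != 0

for port, service in PROJECT_PORTS.items():
    print(f"{port} ({service}): {'free' if port_is_free(port) else 'BUSY'}")
```

Any port reported as BUSY is already occupied, and the corresponding container should be killed before running the services.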
From the project folder, under the orchestration_manager env activated previously (same terminal), run
bash run-services.sh
It will install and launch all services in docker-compose and start the Prefect Orion 2 server.
This is the most important step, and it is better to check that everything works fine:
Now you can open the following UIs in your browser, in the format localhost:tunneled_port or public_ip:tunneled_port (tunneled ports can vary depending on your IDE):
Prefect UI
MLFlow
Grafana
Open a new terminal and, under the orchestration_manager env, run from the project folder
bash run-manager.sh
It will create the Prefect deployments and the Prefect queue and launch a Prefect agent.
Open a new terminal, go from the project folder into the orchestration_manager folder and run
pipenv shell
to activate the venv, then return to the project folder with
cd ..
All services are started and ready to work!
Now you need to train the starting model; it is better to do it from the Prefect UI.
Open 127.0.0.1:your-tunneled-port in your browser (something like 127.0.0.1:4200 or 4201) to reach the Prefect UI.
![Prefect_1](https://user-images.githubusercontent.com/101024338/189550166-749cc4b8-0010-4aa7-8fb1-af5209e88bc7.JPG)
From Deployments > **retrain_request**, press **retrain-model** and RUN it with the button in the right corner.
You can watch the training process in the log of the Prefect agent terminal.
When the first model is created, it will automatically be promoted to the Production stage, and
you can imitate sending data to the prediction service.
From a terminal in the project folder under the orchestration_manager env
(cd orchestration_manager > pipenv shell > cd ..),
run python send_data.py with two parameters: a date in yyyy-mm-dd format and the number of records to send
(the dataset for every month consists of a few thousand records, so it is better to use just a few dozen or a few hundred for a review), like
python send_data.py 2015-05-30 200
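The two positional arguments follow the format just described; a minimal sketch of validating them (a hypothetical helper for illustration, not the project's actual send_data.py code) could look like this:

```python
from datetime import datetime

def parse_send_args(date_str, n_records_str):
    """Validate a yyyy-mm-dd date string and a positive record count."""
    batch_date = datetime.strptime(date_str, "%Y-%m-%d").date()
    n_records = int(n_records_str)
    if n_records <= 0:
        raise ValueError("number of records must be positive")
    return batch_date, n_records

# e.g. the arguments of `python send_data.py 2015-05-30 200`
print(parse_send_args("2015-05-30", "200"))
```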
When all rows have been sent and logged in project_folder/targets, a monthly report can be created. There is no need to wait until the end of the month; let's RUN it manually from the Prefect UI:
Deployments > **batch_analyze**, press monitoring report > RUN
You can watch the result of the report in the logs of the Prefect agent terminal.
The report will be created and saved in project_folder/reports; you can download it with a left-click in VSCode
(I don't save it in the bucket :( )
When a report is created and model drift is taking place, it is possible to run a retrain (manually, for review).
![Training__switch](https://user-images.githubusercontent.com/101024338/189550688-60279906-2e02-4b89-88da-bf48e1cdeca9.JPG)
By default, at the end of the month the Prefect agent will start a deployment that creates a report on the data for the latest month.
It saves an Evidently report to the project/reports folder and estimates whether there is model drift or not.
After that, the retrain service asks the manager service whether the model needs to be retrained on the latest data; if there was drift, the manager returns True.
You can play with the manager service and send data from different months, from 2015-1 to 2015-7.
The logic of the manager service is the following:
report creation and retraining can be launched freely at the start
new data was sent > a report can be created; otherwise it waits for new data
report created and drift detected > retraining is possible; otherwise it waits for a new report
If a report has been created, the manager will not allow creating another one on the same data; new data needs to be loaded first
![Report_no_new_data](https://user-images.githubusercontent.com/101024338/189550361-547ab0fa-5e9c-43e9-a461-7b0bee54b444.JPG)
(just run report creation manually via the Prefect UI twice in a row and watch the logs)
If the model has been retrained, the manager will not allow retraining again; new data needs to be sent and a new report created
![Training_no_request](https://user-images.githubusercontent.com/101024338/189550364-5facf13d-eb88-4f5a-8489-a9bf90cf22ef.JPG)
(just run retrain-model manually via the Prefect UI twice in a row and watch the logs)
I use this logic so that the manager service does not give a signal either for report creation or for retraining the model on old data; a new batch needs to be sent first.
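The gating rules above can be sketched as a tiny state machine. This is a simplified illustration of the described behavior, not the project's actual manager-service code (it also omits the "free at start" allowance):

```python
class ManagerGate:
    """Tracks whether a new batch, a report, and drift currently exist."""

    def __init__(self):
        self.has_new_data = False   # a fresh batch was sent
        self.has_report = False     # a report exists for that batch
        self.drift = False          # that report detected drift

    def data_sent(self):
        # A new batch re-enables report creation.
        self.has_new_data = True

    def allow_report(self):
        # No second report on the same data: require a new batch first.
        if not self.has_new_data:
            return False
        self.has_new_data = False
        return True

    def report_done(self, drift_detected):
        self.has_report = True
        self.drift = drift_detected

    def allow_retrain(self):
        # Retrain at most once per drift-flagged report.
        if not (self.has_report and self.drift):
            return False
        self.has_report = False
        return True
```

Running report creation or retraining twice in a row hits the `return False` branches, which is exactly what the "watch the logs" experiments above demonstrate.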
!!! But you can always launch the retraining process without any restrictions via the Prefect UI by running Deployments > main > initial-train
If everything works fine under normal conditions, you can test the following:
Break the prediction service by manually promoting the current Production model to the None stage in the MLFlow UI
Try to send data: it fails, since there is no prediction model
Try to create a report manually via the Prefect UI: it works, but answers that it is waiting for new data
Try to launch the retraining process manually via Prefect UI retrain_request > retrain-model: it will answer that there is no request for training, because no report was created
Launch retraining via Prefect UI main > initial-retrain: it will train a new model and promote it to the Production stage
Wait a few seconds so that Flask picks up the new model for the prediction service
An important addition to the Fast run section of the Readme: you need to train the initial model via Prefect UI > Deployments > retrain_request, press retrain-model > RUN
git clone https://github.com/K0nkere/kkr-mlops-project.git
It will create the kkr-mlops-project folder that contains my code; in the following I will call it the project folder.
Create your own bucket named <your_bucket_name> in the Cloud Service UI or with a CLI command, if you don't have one already.