Cannot start mysql container on arm v8 #22

Closed
helloerikaaa opened this issue Jan 3, 2023 · 3 comments

@helloerikaaa

helloerikaaa commented Jan 3, 2023

I'm trying to set up an MLflow server on my Raspberry Pi 4B running Ubuntu 22.10, but when I run docker-compose up -d the following error shows up:

Creating network "mlflow-docker_internal" with the default driver
Creating network "mlflow-docker_public" with driver "bridge"
Creating mlflow-docker_s3_1          ... done
Creating mlflow_db          ... done
Creating mlflow-docker_wait-for-db_1       ... done
Creating mlflow-docker_create_s3_buckets_1 ... done

ERROR: for mlflow  Container "477acbc097fb" exited with code 124.
ERROR: Encountered errors while bringing up the project.

The MinIO server works fine, but MLflow is not working at all.

@Toumash
Owner

Toumash commented Jan 3, 2023

Hi @helloerikaaa! Thanks for bringing this up.
Please send me the output of docker compose logs and docker compose logs mlflow.

Also please try to run it one more time:

docker compose down
docker compose up --build

I don't have a Raspberry Pi to test on, but running docker compose on a fresh Windows 10 machine looks like this:

PS C:\repo\mlflow-docker> docker compose logs mlflow
tracker_mlflow  | 2023/01/03 10:00:31 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
tracker_mlflow  | 2023/01/03 10:00:31 INFO mlflow.store.db.utils: Updating database tables
tracker_mlflow  | INFO  [alembic.runtime.migration] Context impl MySQLImpl.
tracker_mlflow  | INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade  -> 451aebb31d03, add metric step
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 7ac759974ad8 -> 89d4b8295536, create latest metrics table
tracker_mlflow  | INFO  [89d4b8295536_create_latest_metrics_table_py] Migration complete!
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 89d4b8295536 -> 2b4d017a5e9b, add model registry tables to db
tracker_mlflow  | INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Adding registered_models and model_versions tables to database.
tracker_mlflow  | INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Migration complete!
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 2b4d017a5e9b -> cfd24bdc0731, Update run status constraint with killed
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade cfd24bdc0731 -> 0a8213491aaa, drop_duplicate_killed_constraint
tracker_mlflow  | WARNI [0a8213491aaa_drop_duplicate_killed_constraint_py] Failed to drop check constraint. Dropping check constraints may not be supported by your SQL database. Exception content: (pymysql.err.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CHECK status' at line 1")
tracker_mlflow  | [SQL: ALTER TABLE runs DROP CHECK status]
tracker_mlflow  | (Background on this error at: https://sqlalche.me/e/14/f405)
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 0a8213491aaa -> 728d730b5ebd, add registered model tags table
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 728d730b5ebd -> 27a6a02d2cf1, add model version tags table
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 27a6a02d2cf1 -> 84291f40a231, add run_link to model_version
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 84291f40a231 -> a8c4a736bde6, allow nulls for run_id
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade a8c4a736bde6 -> 39d1c3be5f05, add_is_nan_constraint_for_metrics_tables_if_necessary     
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 39d1c3be5f05 -> c48cb773bb87, reset_default_value_for_is_nan_in_metrics_table_for_mysql 
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade c48cb773bb87 -> bd07f7e963c5, create index on run_uuid
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade bd07f7e963c5 -> 0c779009ac13, add deleted_time field to runs table
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade 0c779009ac13 -> cc1f77228345, change param value length to 500
tracker_mlflow  | INFO  [alembic.runtime.migration] Running upgrade cc1f77228345 -> 97727af70f4d, Add creation_time and last_update_time to experiments table
tracker_mlflow  | INFO  [alembic.runtime.migration] Context impl MySQLImpl.
tracker_mlflow  | INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [23] [INFO] Starting gunicorn 20.1.0
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [23] [INFO] Listening at: http://0.0.0.0:5000 (23)
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [23] [INFO] Using worker: sync
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [24] [INFO] Booting worker with pid: 24
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [25] [INFO] Booting worker with pid: 25
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [26] [INFO] Booting worker with pid: 26
tracker_mlflow  | [2023-01-03 10:00:33 +0000] [27] [INFO] Booting worker with pid: 27

@helloerikaaa
Author

Hi @Toumash, thank you for your reply.
When I run docker compose up --build, I get the following output:

Attaching to mlflow-docker-create_s3_buckets-1, mlflow-docker-s3-1, mlflow-docker-wait-for-db-1, mlflow_db, tracker_mlflow
mlflow_db                          | exec /entrypoint.sh: exec format error
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:02Z INF [TCP] Checking the db:3306 ...
mlflow-docker-create_s3_buckets-1  | Added `minio` successfully.
mlflow-docker-create_s3_buckets-1  | mc: <ERROR> Unable to make bucket `minio/mlflow`. Your previous request to create the named bucket succeeded and you already own it.
mlflow_db exited with code 0
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:05Z ERR Expectation failed error="timed out while making a tcp call, caused by: dial tcp 172.18.0.2:3306: i/o timeout" timeout=3s
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:05Z INF [TCP] Checking the db:3306 ...
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:05Z ERR Expectation failed error="failed to establish a tcp connection, caused by: dial tcp: lookup db on 127.0.0.11:53: server misbehaving"
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:05Z INF [TCP] Checking the db:3306 ...
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:05Z ERR Expectation failed error="failed to establish a tcp connection, caused by: dial tcp: lookup db on 127.0.0.11:53: server misbehaving"
mlflow-docker-create_s3_buckets-1 exited with code 0
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:06Z INF [TCP] Checking the db:3306 ...
mlflow_db exited with code 0
mlflow_db exited with code 1
mlflow_db exited with code 1
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z ERR Expectation failed error="timed out while making a tcp call, caused by: dial tcp 172.18.0.2:3306: i/o timeout" timeout=3s
mlflow_db exited with code 1
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z INF [TCP] Checking the db:3306 ...
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z ERR Expectation failed error="failed to establish a tcp connection, caused by: dial tcp: lookup db on 127.0.0.11:53: server misbehaving"
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z INF [TCP] Checking the db:3306 ...
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z ERR Expectation failed error="failed to establish a tcp connection, caused by: dial tcp: lookup db on 127.0.0.11:53: server misbehaving"
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z INF [TCP] Checking the db:3306 ...
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:09Z ERR Expectation failed error="failed to establish a tcp connection, caused by: dial tcp: lookup db on 127.0.0.11:53: server misbehaving"
mlflow-docker-wait-for-db-1        | 2023-01-03T19:34:10Z INF [TCP] Checking the db:3306 ...

I get the same log with docker compose logs. If I run docker compose logs mlflow, it returns nothing.

@Toumash
Owner

Toumash commented Jan 5, 2023

So the root problem is with the database not booting.
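
The exit code 124 in the original error fits this picture. 124 is the status GNU timeout returns when a command exceeds its time limit. It seems likely (an assumption, since the thread does not show which container 477acbc097fb was) that the wait-for-db helper gave up waiting for port 3306, which in turn blocked mlflow from starting. A minimal shell sketch of the convention, assuming GNU coreutils is installed:

```shell
# GNU coreutils `timeout` stops the command after 1 second.
# `|| status=$?` captures the exit status without aborting under `set -e`.
status=0
timeout 1 sleep 5 || status=$?
echo "exit status: $status"
```

A status of 124 here means only that the time limit was hit. It says nothing about why the database never answered, which is why the db logs matter more than the mlflow logs.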

I've searched for the issue, and it looks like the problem could be that this MySQL image is not compatible with your processor's architecture, which is ARM.
I've found a project that compiled MySQL 5.5 for the ARM platform, and I would suggest giving it a try. Try swapping

    image: mysql/mysql-server:5.7.28

with

    image: hypriot/rpi-mysql

The other thing I've found is that MySQL now supports ARM v8 in the latest major version, 8:

    image: mysql:8-oracle

Source: https://hub.docker.com/layers/library/mysql/8-oracle/images/sha256-cfddf275c8b1ae1583c0f6afb4899d4dbe14111a6462699559a1f4dc8f4d5f6e?context=explore.
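
Before swapping images, it may help to confirm the mismatch by comparing the host architecture with the architecture recorded in the image. The sketch below is only an illustration: the helper name docker_arch is made up here, and the docker query prints a value only if Docker is installed and the image has already been pulled.

```shell
# Map `uname -m` output to the architecture names Docker reports.
# (docker_arch is a hypothetical helper, not part of this repository.)
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv7l|armv6l) echo "arm" ;;
    *)             echo "unknown" ;;
  esac
}

host_arch=$(docker_arch "$(uname -m)")
echo "host architecture: $host_arch"

# Ask Docker which architecture the pulled image was built for.
image="mysql/mysql-server:5.7.28"
if command -v docker >/dev/null 2>&1; then
  image_arch=$(docker image inspect --format '{{.Architecture}}' "$image" 2>/dev/null || true)
  echo "image architecture: ${image_arch:-not available}"
fi
```

If the host reports arm64 and the image reports amd64, the entrypoint cannot be executed on that CPU, which matches the "exec format error" in the db log.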

@helloerikaaa could you please check whether the following docker-compose.yml works for you? It starts up OK, but I can't seem to get mlflow/python/conda working on my Windows machine to run python ./quickstart/mlflow_tracking.py.

version: "3.9"
services:
  s3:
    image: minio/minio:RELEASE.2021-11-24T23-19-33Z
    restart: unless-stopped
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=${AWS_ACCESS_KEY_ID}
      - MINIO_ROOT_PASSWORD=${AWS_SECRET_ACCESS_KEY}
    command: server /data --console-address ":9001"
    networks:
      - internal
      - public
    volumes:
      - minio_volume:/data
  db:
    image: mysql:8-oracle
    restart: unless-stopped
    container_name: mlflow_db
    expose:
      - "3306"
    environment:
      - MYSQL_DATABASE=${MYSQL_DATABASE}
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
    volumes:
      - db_volume:/var/lib/mysql
    networks:
      - internal
  mlflow:
    container_name: tracker_mlflow
    image: tracker_ml
    restart: unless-stopped
    build:
      context: ./mlflow
      dockerfile: Dockerfile
    ports:
      - "5000:5000"
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_DEFAULT_REGION=${AWS_REGION}
      - MLFLOW_S3_ENDPOINT_URL=http://s3:9000
    networks:
      - public
      - internal
    entrypoint: mlflow server --backend-store-uri mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@db:3306/${MYSQL_DATABASE} --default-artifact-root s3://${AWS_BUCKET_NAME}/ --artifacts-destination s3://${AWS_BUCKET_NAME}/ -h 0.0.0.0
    depends_on:
      wait-for-db:
        condition: service_completed_successfully
  create_s3_buckets:
    image: minio/mc
    depends_on:
      - "s3"
    entrypoint: >
      /bin/sh -c "
      until (/usr/bin/mc alias set minio http://s3:9000 '${AWS_ACCESS_KEY_ID}' '${AWS_SECRET_ACCESS_KEY}') do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc mb minio/${AWS_BUCKET_NAME};
      exit 0;
      "
    networks:
      - internal
  wait-for-db:
    image: atkrad/wait4x
    depends_on:
      - db
    command: tcp db:3306 -t 90s -i 250ms
    networks:
      - internal
networks:
  internal:
  public:
    driver: bridge
volumes:
  db_volume:
  minio_volume:
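
After restarting the stack with this file, one quick way to check whether the database container still hits the architecture problem is to scan its logs for the "exec format error" message seen earlier in the thread. This is a minimal sketch. The helper name has_exec_format_error is illustrative and not part of the repository, and the sample line is copied from the log above.

```shell
# Succeeds when the text on stdin contains "exec format error",
# the message printed when a binary was built for a different CPU.
has_exec_format_error() {
  grep -q "exec format error"
}

sample='mlflow_db                          | exec /entrypoint.sh: exec format error'
if printf '%s\n' "$sample" | has_exec_format_error; then
  echo "architecture mismatch suspected"
fi
```

Against a running stack, the same helper could be fed with docker compose logs db 2>&1 | has_exec_format_error.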

Toumash changed the title from "ERROR: for mlflow Container "477acbc097fb" exited with code 124." to "Cannot start mysql container on arm v8" on Jan 5, 2023
Toumash closed this as completed on Mar 15, 2023