
Could not fetch models meta information #7711

Closed
antortjim opened this issue Apr 2, 2024 · 7 comments
Labels: bug (Something isn't working)


antortjim commented Apr 2, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

git clone https://github.com/opencv/cvat
cd cvat
git checkout v2.10.1
docker-compose up -d

Add to .bashrc:

export CVAT_HOST="0.0.0.0"
export CVAT_USERNAME="foo"
export CVAT_PASSWORD="bar"

I have been using CVAT without trouble for a couple of months. But this week I accidentally pushed my root disk partition above 90% usage (leaving less than 10% of the disk free), and CVAT stopped working. I cleaned up to make more space and restarted CVAT by running:

cd cvat
docker-compose down
# wait a few seconds
docker-compose up -d
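
For anyone hitting the same disk-pressure failure, checking usage before restarting is cheap; these are standard commands, nothing CVAT-specific:

df -h /              # root partition usage
docker system df     # disk used by images, containers, and volumes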

Then I opened CVAT at 0.0.0.0:8080 and got these errors:

[Screenshots: CVAT UI error notifications]

I don't know if the error is caused by the disk issue I mentioned above, but it could be.

Expected Behavior

I expect CVAT to start and run without these errors.

Possible Solution

Updating to the latest version (v2.11.2)?
Modifying the docker-compose.yml file as explained here?
I am afraid to try either of these solutions because they might break my database, but I can try them if someone kindly confirms that, for example, adding a new volume would fix it. Your help would be very much appreciated 🙏

Context

This issue is preventing me from using the cvat-cli. Now I get this error when I run it:

cat cvat_images_index.txt | xargs -s 999999 cvat-cli --insecure --debug --auth foo:bar --server-host http://0.0.0.0 --server-port 8080 create task_name --segment_size 100 --labels labels.json local
[2024-04-02 09:34:41] CRITICAL: Status Code: 500
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'Content-Length': '145', 'Content-Type': 'text/html; charset=utf-8', 'Cross-Origin-Opener-Policy': 'same-origin', 'Date': 'Tue, 02 Apr 2024 09:34:41 GMT', 'Referrer-Policy': 'same-origin', 'Server': 'nginx/1.18.0 (Ubuntu)', 'Vary': 'Origin', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'DENY', 'X-Request-Id': '6f80be6a-d8bf-429e-91b8-76c17aa19383'})
HTTP response body: 
<!doctype html>
<html lang="en">
<head>
  <title>Server Error (500)</title>
</head>
<body>
  <h1>Server Error (500)</h1><p></p>
</body>
</html>

Before this incident, the same command worked fine.
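
For what it's worth, the 500 is reproducible without the CLI. A minimal check against the REST API (hedged: /api/server/about is a CVAT endpoint that normally responds without authentication, though this may vary by version):

curl -v http://0.0.0.0:8080/api/server/about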

Environment

Git hash (git log -1)

commit a33f7f57088744bab61f18e8a8cf6528a0c22fd2 (HEAD -> master, tag: v2.10.1, origin/master)
Merge: d66d043e2 c21062f5f
Author: cvat-bot[bot] <147643061+cvat-bot[bot]@users.noreply.github.com>
Date:   Thu Jan 18 11:10:58 2024 +0000

    Merge pull request #7372 from opencv/release-2.10.1
    
    Release v2.10.1

OS: Ubuntu 22.04.2 LTS
Docker version 20.10.21, build 20.10.21-0ubuntu1~22.04.3

I generated this log using docker logs cvat_server > cvat.log as per this comment

cvat.log

bsekachev (Member) commented Apr 2, 2024

The problem is:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 534, in thread_handler
    raise exc_info[1]
  File "/opt/venv/lib/python3.10/site-packages/django/core/handlers/exception.py", line 42, in inner
    response = await get_response(request)
  File "/opt/venv/lib/python3.10/site-packages/django/core/handlers/base.py", line 253, in _get_response_async
    response = await wrapped_callback(
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 479, in __call__
    ret: _R = await loop.run_in_executor(
  File "/opt/venv/lib/python3.10/site-packages/asgiref/current_thread_executor.py", line 40, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 538, in thread_handler
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/django/views/decorators/csrf.py", line 56, in wrapper_view
    return view_func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/viewsets.py", line 125, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 509, in dispatch
    response = self.handle_exception(exc)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/home/django/cvat/apps/lambda_manager/views.py", line 983, in func_wrapper
    data = func(*args, **kwargs)
  File "/home/django/cvat/apps/lambda_manager/views.py", line 1140, in list
    rq_jobs = [job.to_dict() for job in queue.get_jobs() if job.get_task() in task_ids]
  File "/home/django/cvat/apps/lambda_manager/views.py", line 522, in get_jobs
    job_ids = set(queue.get_job_ids() +
  File "/opt/venv/lib/python3.10/site-packages/rq/queue.py", line 378, in get_job_ids
    job_ids = [as_text(job_id) for job_id in self.connection.lrange(self.key, start, end)]
  File "/opt/venv/lib/python3.10/site-packages/redis/commands/core.py", line 2715, in lrange
    return self.execute_command("LRANGE", name, start, end)
  File "/opt/venv/lib/python3.10/site-packages/redis/client.py", line 1255, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/opt/venv/lib/python3.10/site-packages/redis/connection.py", line 1442, in get_connection
    connection.connect()
  File "/opt/venv/lib/python3.10/site-packages/redis/connection.py", line 704, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error -3 connecting to cvat_redis_inmem:6379. Temporary failure in name resolution.

Be sure that the cvat_redis_inmem container is healthy.
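
A quick way to check (plain Docker commands, nothing CVAT-specific):

docker ps --filter name=cvat_redis_inmem                 # STATUS column shows restart loops
docker inspect -f '{{.State.Status}}' cvat_redis_inmem
docker logs --tail 50 cvat_redis_inmem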

antortjim (Author) commented Apr 2, 2024

You are right, docker ps shows:

CONTAINER ID   IMAGE                                       COMMAND                  CREATED        STATUS                         PORTS                                                                                          NAMES
a0bb9292b4c7   cvat/ui:v2.10.1                             "/docker-entrypoint.…"   17 hours ago   Up 17 hours                    80/tcp                                                                                         cvat_ui
2f0b8290134d   timberio/vector:0.26.0-alpine               "/usr/local/bin/vect…"   17 hours ago   Up 17 hours                                                                                                                   cvat_vector
7bb0f5467e7e   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_server
e6b1fa01c959   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_import
c6cbcff7dbba   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_analytics_reports
46c4ac8fd87a   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_utils
4897e0aac9ff   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_webhooks
057b1c1507a7   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_annotation
082355426ca7   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_quality_reports
c85a7b66c845   cvat/server:v2.10.1                         "./backend_entrypoin…"   17 hours ago   Up 17 hours                    8080/tcp                                                                                       cvat_worker_export
c082c0ebe7ca   redis:7.2.3-alpine                          "docker-entrypoint.s…"   17 hours ago   Restarting (1) 3 seconds ago                                                                                                  cvat_redis_inmem
2bbaf17450bc   traefik:v2.10                               "/entrypoint.sh trae…"   17 hours ago   Up 17 hours                    0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 80/tcp, 0.0.0.0:8090->8090/tcp, :::8090->8090/tcp   traefik
90a095d72c64   postgres:15-alpine                          "docker-entrypoint.s…"   17 hours ago   Up 17 hours                    5432/tcp                                                                                       cvat_db
1370b63e70b9   openpolicyagent/opa:0.45.0-rootless         "/opa run --server -…"   17 hours ago   Up 17 hours                                                                                                                   cvat_opa
a2625182f2e6   apache/kvrocks:2.7.0                        "kvrocks -c /var/lib…"   17 hours ago   Up 17 hours (healthy)          6666/tcp                                                                                       cvat_redis_ondisk
e62f8d87ef38   clickhouse/clickhouse-server:23.11-alpine   "/entrypoint.sh"         17 hours ago   Up 17 hours                    8123/tcp, 9000/tcp, 9009/tcp                                                                   cvat_clickhouse

So I ran docker logs cvat_redis_inmem > cvat_redis_inmem.log, and the logs contain the same block over and over again:

1:C 01 Apr 2024 20:09:51.839 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:C 01 Apr 2024 20:09:51.839 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 01 Apr 2024 20:09:51.839 * Redis version=7.2.3, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 01 Apr 2024 20:09:51.839 * Configuration loaded
1:M 01 Apr 2024 20:09:51.840 * monotonic clock: POSIX clock_gettime
1:M 01 Apr 2024 20:09:51.841 * Running mode=standalone, port=6379.
1:M 01 Apr 2024 20:09:51.842 * Server initialized
1:M 01 Apr 2024 20:09:51.843 * Reading RDB base file on AOF loading...
1:M 01 Apr 2024 20:09:51.843 * Loading RDB produced by version 7.2.3
1:M 01 Apr 2024 20:09:51.843 * RDB age 3031639 seconds
1:M 01 Apr 2024 20:09:51.843 * RDB memory usage when created 1.48 Mb
1:M 01 Apr 2024 20:09:51.843 * RDB is base AOF
1:M 01 Apr 2024 20:09:51.843 * Done loading RDB, keys loaded: 69, keys expired: 0.
1:M 01 Apr 2024 20:09:51.843 * DB loaded from base file appendonly.aof.2.base.rdb: 0.001 seconds
1:M 01 Apr 2024 20:09:52.705 # Bad file format reading the append only file appendonly.aof.2.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>

I think the problem is exposed in the last line, but given that info, I don't know what to do next.
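
For future readers: the last log line points at redis-check-aof --fix as an alternative repair path. A sketch of running it from a throwaway container against the Redis volume — the volume name is taken from later in this thread, and the in-container paths are assumptions based on the official redis image defaults (/data as the data dir, with Redis 7 keeping multi-part AOF files under an appendonlydir subdirectory):

docker-compose down
docker run --rm -it -v cvat_cvat_inmem_db:/data redis:7.2.3-alpine \
  redis-check-aof --fix /data/appendonlydir/appendonly.aof.manifest
docker-compose up -d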

bsekachev (Member) commented Apr 2, 2024

Try these steps:

  1. docker compose down
  2. Remove the Docker volume that stores the inmem cache (if you send me the output of docker volume ls, I will tell you which one)
  3. docker compose up

bsekachev (Member) commented Apr 2, 2024

Also, we dropped KeyDB (the solution we used to handle the on-disk cache) because it did not behave stably and caused a lot of issues for us.
It was replaced with KVRocks in PR #7339 (starting from v2.10.1).

antortjim (Author) commented Apr 2, 2024

Hi @bsekachev, thank you for your kind help! I actually have CVAT version 2.10.1 (that's what you refer to in your previous comment, right?). Here is the output of docker volume ls:

DRIVER    VOLUME NAME
local     0ffd0c2529618ce55b105af4be95022738fff10844dd4320a9ad95b2300c5659
local     0001e018d53cd103d343d31e3f3836b2fbc5927402141d3ec87720de11d05492
local     2af79d9fbe9b3ef52f568b39cd5902123030bf9dca10e6dee88aaa6c46035fad
local     2b7cb1e5f78c01bf0c5ff3926173979f758e049550aec7bc180b569000610100
local     4f1fe6c67bbda50648ab9b32c38305acbc00e6e207b47fdad2f5cdc3489d3f68
local     5f5bb5d9588152f17d7bb1c453e6354a2b994a1e0cf1494209cc9cb44e0bad76
local     9bcf49b4809970f6020afc95b144ca9e958213bdf37e5576e74908706a81c951
local     10d01577d10d553d66a0dcd50f1d0e60ec0ab0bc354aa27e9265a1c7d2811550
local     16e2078d512923ea638ce74b642ff1d3736274af89a7dda1a41e09f2d247fb43
local     25c1a3b16c74c428716c5d292e790db648f5f22d761797faac836d4f29fb91d1
local     40e959f78f7a9b187e02c3e97831f23794b0ea2fea297b2044cc8b09c32f2a5b
local     49b67cada9fef48b95b179c51b77fc869e107642c27b657a348e0ec093e438f1
local     52faa6891857de75a22846052c56382c3ad64880a3a0f8ddf7ae15bcd96619fa
local     53d6036663086e3c7f429fa2eda612aa666cf6b3dcf9239dabc43a8232bdb612
local     54feeb20e1804086b732a453ab28d3dac6c9921fb6ba925bb288b5ba3e6a9e81
local     81e2a0d3994e59809ad2ed07e53a021c779f1a62a632c5d3bb7e9d82c31ddc46
local     278dab357cc3f8fb76d1176728eb4d8aa3b9971846266ac890088287293db927
local     741b969c1ba22ced869fffafdc4eedaac9af554294df4b1d57d079dc9cfd091a
local     979cdf91868adc5d9d4eefc749cd00e0b1cbeb2c4286b2ae4e990241030f1ca1
local     4009c83475887b240d4b22da400b70136309f9a34f54ef77779e54f8cac5bd60
local     25412fc498d40af17934918c782c6c9c3e3619b413ed29f90de369bc5ce8c5e2
local     39186f01083b19389767225dbca898258823062d8115c29ab17f268e530b98fc
local     307920d19fb3081069b4800cdd7ef9f7fe8ab3e012b18b3dd99622c20f6207ff
local     6666814b362b1facba26d1c875f9ab7be82ec8ec43dc7f998466d9a350ae9358
local     a05587fe37d20a0e539c3d5d0ac96ecf5154f45f5bc317edbdeae6f10fae5269
local     a429658459e4ec5291b809078de99690076ae2c51f09175c29ed6dbb337c564b
local     ba25698a34074d3ac006227998acebb40e1178bf116e3fca282edf6c2c2142ee
local     c8b7a6b80b5b278e275532629800aa1f50262e2a944f3a7063b1eef2fa285766
local     cvat_cvat_cache_db
local     cvat_cvat_data
local     cvat_cvat_db
local     cvat_cvat_events_db
local     cvat_cvat_inmem_db
local     cvat_cvat_keys
local     cvat_cvat_logs
local     d5494c9b22dc8734f59c24215b4722edd05c665eed104f92b79596c6a56db147
local     db4809cea0888dc9155dbb5b671fccd6230bde03424b58585d499efa44714500
local     dcdcf2cf81a612d2a47e30f410f9590ee357a5478b3b49baab5b7d3c88b0759d
local     dd0cf0712fd3f38fa7f59e3118eb67dd58ffcda3741abc45e995d649af006fdf
local     e3da743aa9496208c2c8d6015a6a24d5fe174710de2563611af658f5266b365d
local     ece48135f656716fb87ebb66f2c0b73d53581854e2a4debbbe2242bcf96c0474
local     f6c2cf0c20889574c9f5f9a8338c663ce5de143e830f8c074a7458327b8f9fd7
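
A side note for future readers: the named CVAT volumes can be picked out from the anonymous ones with a name filter (docker volume ls supports --filter):

docker volume ls --filter name=cvat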

bsekachev (Member) commented Apr 2, 2024

> Hi @bsekachev, thank you for your kind help! I actually have CVAT version 2.10.1

Yep, I just noticed that. Try:

docker volume rm cvat_cvat_inmem_db

antortjim (Author) commented Apr 2, 2024

The warning messages are gone and I can use cvat-cli again. Thank you so much! For potential future readers, the problem was solved with:

cd cvat
docker-compose down
# wait
docker volume rm cvat_cvat_inmem_db
docker-compose up -d
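
As a follow-up, the Redis startup log above also warns about memory overcommit being disabled. Applying its suggestion on the host (taken verbatim from the warning; run as root, and adjust for your distro) may make Redis background saves more robust under memory pressure:

sysctl vm.overcommit_memory=1                          # apply immediately
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf    # persist across reboots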
