Single Container version of Compreface #651

Open
alexdelprete opened this issue Oct 31, 2021 · 99 comments

@alexdelprete

Is your feature request related to a problem? Please describe.
We're using CompreFace as the recognition engine of Double Take (a Home Assistant addon): https://github.com/jakowenko/double-take

Double Take dev (@jakowenko) created the addons for Double Take and other recognition frameworks: DeepStack and Facebox, but our preferred solution is Compreface. Unfortunately we can only build an addon starting from a single container, while Compreface requires five separate containers, unlike DeepStack and Facebox.

Describe the solution you'd like
A single-container build of CompreFace, so we can create a Home Assistant addon for Double Take users, who right now are using DeepStack only because an addon is available for it, while we recommend CompreFace for the best recognition results.

@pospielov
Collaborator

I don't think that running CompreFace in one image is a good idea.
Right now, if something goes wrong, you can just restart one of the services while the others continue to work.
But I got the idea - it should make it much easier to run with Home Assistant.
I'll add it to our backlog and try to make it as soon as we have resources for it.

I haven't used HA, but I see that it has another mechanism: integrations. In that case you have to run CompreFace yourself, but there is still an integration with HA. Would that work for you?

@alexdelprete
Author

alexdelprete commented Nov 2, 2021

I agree with you, but in order to use CompreFace as an addon in HassOS we need a single container; it's a prerequisite for a supported addon in the HA environment. And I'm not asking this just for myself: there are literally hundreds of users using Double Take + Frigate for face recognition automation, and unfortunately they are forced to use DeepStack now because it's the only addon available, while advanced users install CompreFace manually, even though HA doesn't support it, because of its superior recognition capabilities.

We recommend CompreFace to users, but the feedback is that it's too hard to install. So we need to develop an addon that installs it directly in HA, as has been done for DeepStack. It's a pity these users have to fall back to an inferior solution like DeepStack; they deserve CompreFace. I hope you can understand this and help us reach the objective.

As you can see here, DoubleTake is the only addon that @jakowenko has been able to implement, we are waiting for a single container version of CompreFace now so users can have the better choice.

image

I haven't used HA, but I see that it has another mechanism: integrations. In that case you have to run CompreFace yourself, but there is still an integration with HA. Would that work for you?

CompreFace addon will be used as an engine by the Double Take addon, right now we are using CompreFace in an unsupported way manually installing it into HA's docker. Once installed, Double Take gets the snapshots from Frigate, and passes them to CompreFace for recognition, then it processes the result and creates HA sensors to do the automations. It's working beautifully but this kind of setup is unsupported unfortunately so users are reverting to DeepStack. :(

Thanks for your reply and I hope you can deliver the single container version as soon as possible. :)

@bigbangus

I second this, also for the Unraid community. It would be ideal to have it in a single container that can be pulled from the App store template. Currently there's a steep learning curve, and it requires the "compose.manager" plugin to work. So the majority of users will default to deepstack with reportedly poor recognition performance when pairing frigate with double take for facial recognition on rtsp feeds.

@alexdelprete
Author

alexdelprete commented Nov 16, 2021

So the majority of users will default to deepstack with reportedly poor recognition performance when pairing frigate with double take for facial recognition on rtsp feeds.

That's the main problem we're facing now for the Double Take project: 90% of support questions are for Deepstack installation and poor recognition rate.

We need to switch them to CompreFace, but we need a single container version to be able to do that. I hope @pospielov can give us an idea of the time frame for this version to be available...

@pospielov
Collaborator

I am working on it right now and already have the first working version. There are still some problems. In an optimistic scenario, I'll publish it this week and write a short description of how to run it here. Normal readme I'll write a little bit later.

@alexdelprete
Author

I am working on it right now and already have the first working version. There are still some problems. In an optimistic scenario, I'll publish it this week and write a short description of how to run it here

Amazing news Pospielov. Thank you so much. Users will be happy to switch from DeepStack to CompreFace, like we suggested.

Please let me know when it's ready so we can test it. Thank you.

@bigbangus

I am working on it right now and already have the first working version. There are still some problems. In an optimistic scenario, I'll publish it this week and write a short description of how to run it here. Normal readme I'll write a little bit later.

Thank you. A lot of Unraid users have an Nvidia GPU for containers like plex so if you could include a method to pass extra parameters for the GPU custom builds also that would be ideal. Again thank you for your effort and time.

@Iceman248

Thanks for the work on this. We greatly appreciate it and look forward to the end result.

@corgan2222

corgan2222 commented Nov 18, 2021

I am working on it right now and already have the first working version. There are still some problems. In an optimistic scenario, I'll publish it this week and write a short description of how to run it here. Normal readme I'll write a little bit later.

Awesome, that's great News!

I have written the Unraid Docker templates for double-take and facebox, but because Unraid does not support docker-compose, users are stuck with a command-line installation of CompreFace, which isn't ideal for the average user.

As @bigbangus mentioned, if you could add support for GPU processing, it would be awesome, but not mandatory.
So just comment here, or message me directly, if you have something to test.
I would then turn your Docker setup into an Unraid app template for the Community Store.

@pospielov
Collaborator

Hi all, it looks like I managed to make a single image of CompreFace
How to run:
docker run -it --name=CompreFace -v compreface-db:/var/lib/postgresql/data -p 8000:80 exadel/compreface:0.6.1
where:
--name=CompreFace is the name of the container
compreface-db is the name of the volume. This is important for keeping the data if you delete the container.
Under the hood it uses supervisord, which gives behavior similar to docker-compose: if a service fails, supervisord restarts it automatically.
Some other notes:

  1. Compared to docker-compose, we can't guarantee that the DB starts before the servers. I added a 10-second delay between the start of the DB and the first service.
  2. If the DB starts too slowly, the service will fail, but supervisord will restart it. So in a bad scenario you will see lots of errors in the logs, but it will work.
  3. In the worst scenario one of the services won't start and supervisord will keep trying to start it endlessly. You will see the same error repeated in the console. In this case, send me the log.
  4. The single-container version of CompreFace isn't tested by the standard QA process. On the other hand, I reuse the images from DockerHub, so basically the only difference is the deployment method.
  5. As I reuse the images from DockerHub, I managed to build all official versions of CompreFace: FaceNet (default), Arcface-R100, Arcface-R100-GPU, MobileNet, MobileNet-GPU
    To run the GPU version, you need to add --runtime=nvidia:
    docker run -it --name=CompreFace -v compreface-db:/var/lib/postgresql/data --runtime=nvidia -p 8000:80 exadel/compreface:0.6.1-arcface-r100-gpu
  6. The environment variables from the docker-compose version are still valid, so to set the API server memory limit you can run:
    docker run -it --name=test -e "API_JAVA_OPTS=-Xmx8g" -v compreface-test-db:/var/lib/postgresql/data --runtime=nvidia -p 8000:80 exadel/compreface:0.6.1-mobilenet-gpu
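
The fixed 10-second delay from note 1 is one way to order startup. A common alternative is to poll until the database answers. What follows is only a minimal sketch of that idea, not how the image works: the `wait_for` helper is hypothetical, and it assumes `pg_isready` (a standard PostgreSQL client utility) is available inside the container.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
# Hypothetical helper, not part of CompreFace.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the remaining arguments as the probe command.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use inside the container (assumes pg_isready is on PATH):
#   wait_for 30 1 pg_isready -q && supervisorctl start compreface-admin
```

With polling, a slow database only delays the dependent service instead of producing the restart errors described in note 2.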

@alexdelprete
Author

Pospielov, thank you so much, we will test and let you know in case of problems. @jakowenko will try to build an addon for Home Assistant out of this.

@Iceman248

Thanks. I'll test once an unRAID app template is done. I look forward to that. 😀

@corgan2222

Awesome! Will test this on Unraid!

@corgan2222

@pospielov

Thanks for this container! Runs great on Unraid with default FaceNet and arcface-r100-gpu.
But I have problems with the volume.

If I start the container without the volume, the /var/lib/postgresql/data Folder has the correct permission.
grafik

But if I configure the volume path, with

Unraid Docker Command:
/usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='CompreFace' --net='bridge' --privileged=true -e TZ="Europe/Berlin" -e HOST_OS="Unraid" -p '8800:80/tcp' -v '/mnt/user/appdata/compreface':'/var/lib/postgresql/data':'rw' --runtime=nvidia 'exadel/compreface:0.6.1-arcface-r100-gpu'

I get these errors:

2021-11-20 04:04:05,942 INFO success: compreface-postgres-db entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
today at 04:04:05 2021-11-20 03:04:05.953 GMT [14] LOG: skipping missing configuration file "/var/lib/postgresql/data/postgresql.auto.conf"
today at 04:04:05 2021-11-20 03:04:05.953 UTC [14] FATAL: "/var/lib/postgresql/data" is not a valid data directory
today at 04:04:05 2021-11-20 03:04:05.953 UTC [14] DETAIL: File "/var/lib/postgresql/data/PG_VERSION" is missing.

021-11-20 03:06:59.579 GMT [45] LOG: skipping missing configuration file "/var/lib/postgresql/data/postgresql.auto.conf"
today at 04:06:59 2021-11-20 03:06:59.579 UTC [45] FATAL: data directory "/var/lib/postgresql/data" has invalid permissions
today at 04:06:59 2021-11-20 03:06:59.579 UTC [45] DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).

and the permissions are set to 777.
Strangely, if I exec into the container and chmod the folder to 755, I still get the same errors.

I tested the Container on several Linux machines, and it runs fine. So that must be a problem with the unraid docker system.
Any Ideas?

So ATM the Container is usable, but without a persistent data Folder.
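
On the permissions error above: the log states that only 0700 or 0750 are accepted, so 755 falls outside that set just as 777 does, which is consistent with the error persisting after the chmod. A small sketch for checking a directory by hand, assuming GNU `stat` (the `pgdata_mode_ok` name is made up for illustration):

```shell
#!/bin/sh
# Report whether a directory's mode is one PostgreSQL accepts for its data
# directory: 0700 or 0750, as stated in the FATAL/DETAIL log lines.
# Uses GNU stat (-c %a); the flag differs on BSD/macOS.
pgdata_mode_ok() {
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        700|750) return 0 ;;
        *) echo "mode $mode on $1 is not 700 or 750" >&2; return 1 ;;
    esac
}

# Example (path taken from the bind mount in the command above):
#   pgdata_mode_ok /mnt/user/appdata/compreface
```

This only checks the mode bits; it says nothing about ownership or about whether the host filesystem honours a chmod at all.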

@pospielov
Collaborator

@corgan2222 in my example I used named volume:
-v compreface-db:/var/lib/postgresql/data
The default behavior of a named volume is that when the volume is first created, Docker copies the content the image has at the mount path into the volume, with all permissions. The Docker documentation recommends named volumes because of portability.

When you mount an exact folder, Docker creates a bind mount:
-v '/mnt/user/appdata/compreface':'/var/lib/postgresql/data':'rw'
By default, docker does not copy image content into bind mounts. This is why the Postgres data folder is empty inside your container and Postgres can't start.
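
To make the distinction concrete, here is a rough sketch of how the source part of a `-v SOURCE:TARGET` argument can be told apart. The `mount_kind` function is hypothetical, and the rule it encodes is a simplification: an absolute path is treated as a bind mount, and a plain name made of letters, digits, `_`, `.` and `-` as a named volume.

```shell
#!/bin/sh
# Classify the source part of a `docker run -v SOURCE:TARGET` argument.
# Simplified rule (an assumption of this sketch): absolute path => bind
# mount; plain name of letters, digits, '_', '.', '-' => named volume.
mount_kind() {
    case "$1" in
        /*) echo "bind" ;;
        *[!A-Za-z0-9_.-]*|'') echo "unknown" ;;
        *) echo "volume" ;;
    esac
}

mount_kind compreface-db                  # prints: volume
mount_kind /mnt/user/appdata/compreface   # prints: bind
```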

So I would follow the Docker recommendation and use named volumes. If that is impossible, or you still prefer to bind to a specific folder, there is a workaround.
You can create a named volume that is backed by a bind mount:

docker volume create --driver local \
    --opt type=none \
    --opt device=/mnt/user/appdata/compreface \
    --opt o=bind \
    compreface-db

And then use it as named volume during start:
docker run -it --name=CompreFace -v compreface-db:/var/lib/postgresql/data -p 8000:80 exadel/compreface:0.6.1

@corgan2222

corgan2222 commented Nov 21, 2021

@pospielov Thanks for the clarification, now the behavior makes sense.
The main problem is that Unraid users can't change how the Docker process works.
This is how all Docker volumes in Unraid look:

grafik

As app maintainers we only create XML templates. The Unraid GUI then creates the Docker containers.
The average user doesn't even know that a docker run command exists.
This is how the CompreFace XML looks.

GUI View:

grafik

I'm not that deep into the Unraid Docker system. Do you have any other ideas, or should I forward this to the Unraid devs?

@pospielov
Collaborator

What if you put a value without a leading / in Host Path? Like compreface-db
If that doesn't work, then it's probably intentional.
https://forums.unraid.net/topic/88920-named-volumes-vs-bind-mount/
I see that they don't recommend named volumes. I would like to put them in one room with the Docker developers and let them argue with each other :)
So, if it doesn't work, it's a good idea to ask the Unraid developers what their best practice is in this case.
I see one workaround: I can keep a backup of the initial Postgres data files somewhere in the image and, during startup, check whether the /var/lib/postgresql/data folder is empty and, if so, copy the backup files into it. But this looks like a hack, not a normal implementation.

@bigbangus

Not sure if this helps regarding mounted volumes, but I'm running compreface now in Unraid with the modifications to docker-compose.yml below and it works fine. (not using app templates, but docker compose plugin)

@pospielov were you after performance when you decided to use a named volume instead of a bind mount?
I found this thread that supports this idea: https://stackoverflow.com/questions/64629569/docker-bind-mount-directory-vs-named-volume-performance-comparison
So maybe in Windows/MacOS, compreface would be quicker with named volumes? I don't know. It just seems that every docker container on Unraid (of which there are thousands) all use bind mounts to separate application from data.

version: '3.4'

###volumes:
###  postgres-data:

services:
  compreface-postgres-db:
    restart: always
    image: postgres:11.5
    container_name: "compreface-postgres-db"
    environment:
      - POSTGRES_USER=${postgres_username}
      - POSTGRES_PASSWORD=${postgres_password}
      - POSTGRES_DB=${postgres_db}
    volumes:
###      - postgres-data:/var/lib/postgresql/data
      - /mnt/user/appdata/compreface/postgres:/var/lib/postgresql/data

image

@pospielov
Collaborator

@bigbangus CompreFace uses named volumes by default because this is the only way it can work for everyone.
I mean, bind mounts require a host path, which depends on the folder structure of your machine and on the operating system.
So the only way to guarantee volume creation is to use named volumes.

Of course, experienced users can change it if they want.
I don't expect serious performance issues during recognition with Compreface and bind mounts if you use double-take.
We cache all the data required for recognition in memory. So, as double-take always uses one api-key, you can be sure that CompreFace won't read from the database on each recognition request.
Of course, when you save a new example, CompreFace will save it to the database and sync the cache. So this may be a slow point. But I don't think this is a common operation.

Also, your docker-compose file raises even more questions. You also use bind mounts, but in your case Docker copied the image content into the folder. I don't know why; probably docker-compose does something similar to what I showed as a workaround. Still, as I understand it, the problem is that Unraid doesn't support it.

@corgan2222

I don't know why - probably docker-compose does something similar to what I showed as a workaround. Still, as I understand, the problem is that Unraid doesn't support it.

Yes, that's also for me the main point.
I will look into this topic next weekend, as I'm on a working trip this week. Thanks for your help so far!

@bigbangus: the reason for this ticket and @pospielov work with a single container was to bring compreface to more users. I think only a few users would install docker.compose on the command line.

@bigbangus

@bigbangus: the reason for this ticket and @pospielov work with a single container was to bring compreface to more users. I think only a few users would install docker.compose on the command line.

Yes 100% agree. It should be in a single container to reach a greater audience. I was just trying to show it works with binding when using docker compose so why can't it work that way with templates in Unraid. What's the actual problem?

@pospielov
Collaborator

@corgan2222 Please contact the Unraid team and ask what workaround they see in this case. I don't think this is the first time somebody has faced this problem.
If they don't offer a solution, I'll try to come up with something in the CompreFace image.

@jakowenko

Thank you so much @pospielov for getting this single container version working.

With the help of @bentasker and @alexdelprete we got the Home Assistant add-on working with persistent storage! I'm going to let the HA community know as well which should help drive traffic towards CompreFace.

Screen Shot 2021-11-27 at 1 29 24 PM

@alexdelprete
Author

Thank you so much @pospielov for getting this single container version working.

With the help of @bentasker and @alexdelprete we got the Home Assistant add-on working with persistent storage! I'm going to let the HA community know as well which should help drive traffic towards CompreFace.

One month ago I told you we needed the CompreFace addon so users could have the best solution (I hate DeepStack). I started this thread, and here we are...sometimes magic happens. Thanks to @pospielov and @bentasker. :)

@alexdelprete
Author

alexdelprete commented Nov 30, 2021

@pospielov a question: the addon is working fine, but some users pointed out that it allocates 3-4GB of RAM at startup without releasing it. @bentasker, who helped us create the addon, confirmed it's a Java config setting. I checked the container info and found this:

here's the allocated ram:

image

and we found these configuration env. variables:

image

we were wondering if we could modify API_JAVA_OPTS with a lower value than -Xmx4g.

please take into account that on average Home Assistant users are not running on very powerful systems with tons of RAM, so this might be a problem for an 8GB system with other apps/addons running. I hope we can fine tune the memory requirement.

Thanks for any suggestion. Tagging also @jakowenko so we're all aligned.

@pospielov
Collaborator

@jakowenko I have one favor to ask, could you rename CompreFace to Exadel CompreFace in your addon?

@pospielov
Collaborator

@alexdelprete Java caches the embedding of each subject example in memory. In my experience, 50 000 examples need about 8 GB of RAM, so I think you can reduce this number. Java uses as much memory as it is allowed, but the garbage collector is quite effective, and we tested CompreFace for memory leaks; everything looks OK.
Also, working with the MobileNet custom build will help, as it works with vectors of 128 numbers, not 512.
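
As a back-of-the-envelope aid for choosing an `-Xmx` value, the figure above (about 8 GB for 50 000 examples) can be scaled down. The sketch below assumes the relationship is linear and ignores any fixed JVM overhead; both are assumptions, and `estimate_heap_mb` is a hypothetical helper.

```shell
#!/bin/sh
# Rough heap estimate in MB, scaled linearly from the quoted figure of
# about 8 GB (8192 MB) for 50 000 examples. Linear scaling is an
# assumption of this sketch, not a measured property of CompreFace.
estimate_heap_mb() {
    examples=$1
    echo $(( examples * 8192 / 50000 ))
}

estimate_heap_mb 5000   # prints 819
```

By this estimate a few thousand examples fit within a 1 GB heap, which could then be set with `-e "API_JAVA_OPTS=-Xmx1g"`.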

@pospielov
Collaborator

@corgan2222 Any news from Unraid?

@pospielov
Collaborator

pospielov commented Dec 14, 2021

What version of postgres should we use on Unraid with compreface.

I tested with Postgres 11.5 and Postgres 13.5. Both worked fine.

@alexdelprete
Author

Thank you Pospielov, so now we have the base image without the named volume and using a variable the user can decide if he wants to use the internal postgres or the external one. In case he wants to use the internal one, a volume has to be created, right? We'll have to provide some instructions regarding this in the addon readme.

Thanks a lot for making all this possible.

@pospielov
Collaborator

pospielov commented Dec 17, 2021

In case he wants to use the internal one, a volume has to be created, right?

It can work without a volume, but then the user will lose all the data if he deletes the container.

@alexdelprete
Author

It can work without a volume, but then the user will lose all the data if he deletes the container.

Obviously we'll recommend creating persistent storage for the data, i.e. creating a volume.

@alexdelprete
Author

alexdelprete commented Jan 10, 2022

@pospielov the tag exadel/compreface:0.6.1-mobilenet would pull the latest official single-image version with EXTERNAL_DB support, from the official exadel repository, right? Because I didn't find any reference in the documentation about EXTERNAL_DB, etc.

So with a docker-compose like this I would have a single image version of Compreface MobileNet version, with an external Postgres (they have to manually create an empty db, and the user/pw), and optimized regarding java memory requirements correct?

version: '3.4'
services:
  compreface:
    image: exadel/compreface:0.6.1-mobilenet
    restart: unless-stopped
    container_name: "compreface"
    ports:
      - "8000:80"
    environment:
      - POSTGRES_USER=compreface
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=comprefacedb
      - POSTGRES_URL=jdbc:postgresql://postgres.mydomain.lan:5432/comprefacedb
      - EXTERNAL_DB=true
      - API_JAVA_OPTS=-Xmx1g
      - ADMIN_JAVA_OPTS=-Xmx1g

@pospielov
Collaborator

yes, correct
https://github.com/exadel-inc/CompreFace/blob/master/docs/Installation-options.md#single-docker-container
here is the documentation

@alexdelprete
Author

alexdelprete commented Jan 11, 2022

Thanks for confirming. Docs are very well written.

I tried spinning up the single container version on a small debian server, a virtual machine on proxmox: I assigned 15GB of storage space, and during startup it went out of space. So I gave it 50GB, and restarted: it went up to 48GB, but it was running. :(

Is it normal? Why is it taking up so much space? The DB was external, my central postgres server.

@pospielov
Collaborator

pospielov commented Jan 12, 2022

It definitely shouldn't be like this...
Unpacked exadel/compreface:0.6.1 image takes 4.32GB of disk space according to docker, exadel/compreface:0.6.1-mobilenet image takes only 2.35GB of disk space.
image
After starting the exadel/compreface:0.6.1-mobilenet container, the disk usage didn't increase. However, the RECLAIMABLE space decreased; I expect that relates to data I created before, so the container probably still takes about 1 GB more.
Still, that's nowhere near 15 GB of disk space.

Could you run:
docker system df
docker images
docker ps --size
and share results?

@alexdelprete
Author

I created a new LXC container on Proxmox, 20GB of space, Debian 11, updated with latest patches: 700MB more or less before the setup of Compreface.

Docker-compose has just completed this:

version: '3.4'
services:
  compreface:
    image: exadel/compreface:0.6.1-mobilenet
    restart: unless-stopped
    container_name: "compreface"
    ports:
      - "8000:80"
    environment:
      - POSTGRES_USER=compreface
      - POSTGRES_PASSWORD=password
      - POSTGRES_URL=jdbc:postgresql://postgres.mydomain.lan:5432/comprefacedb
      - EXTERNAL_DB=true
      - API_JAVA_OPTS=-Xmx1g
      - ADMIN_JAVA_OPTS=-Xmx1g

And this is the ending result of the setup:

image
image

During the setup I had to enlarge the disk 4 times, up to 40GB. :(

Here are the requested commands' outputs, and it seems you're right, but Proxmox is telling me that almost 38GB have been used after the setup. This is one of the reasons why I don't like docker...too much magic happening in the backstage. :)

root@compreface:/usr/local/etc/mycontainers/compreface# docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          1         1         2.397GB   0B (0%)
Containers      1         1         71.15kB   0B (0%)
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
root@compreface:/usr/local/etc/mycontainers/compreface# docker images
REPOSITORY          TAG               IMAGE ID       CREATED       SIZE
exadel/compreface   0.6.1-mobilenet   66a2b86515ca   4 weeks ago   2.4GB
root@compreface:/usr/local/etc/mycontainers/compreface# docker ps --size
CONTAINER ID   IMAGE                               COMMAND                  CREATED         STATUS         PORTS                                             NAMES        SIZE
337ab87ba67c   exadel/compreface:0.6.1-mobilenet   "/usr/bin/supervisord"   8 minutes ago   Up 7 minutes   3000/tcp, 0.0.0.0:8000->80/tcp, :::8000->80/tcp   compreface   73.7kB (virtual 2.4GB)

@alexdelprete
Author

Maybe @bentasker can shed some light about this...

@alexdelprete
Author

I dug into it a little, looking for the top 5 folders by size:

39526576	.
38836862	./var
38678736	./var/lib
38620922	./var/lib/docker
38616983	./var/lib/docker/vfs

The beast is vfs/dir, top 5 in there:

root@compreface:/var/lib/docker/vfs/dir# du -a | sort -n -r | head -n 5
38617420	.
1372400	./fda1856999f8067977a926ddf996a6e1a0662d568f5f60a5fa87ba7f7d5380da
1371829	./82d04c404aa2d8f9126d09570da5be3226230abf580cadaee13e34ac0bd3fd31
1371820	./fda1856999f8067977a926ddf996a6e1a0662d568f5f60a5fa87ba7f7d5380da-init
1371809	./74c77d64fb806da605adabe72be0b9656cca36d5d902afa7a42d82676ef83da8

Checked one of them:

root@compreface:/var/lib/docker/vfs/dir/fda1856999f8067977a926ddf996a6e1a0662d568f5f60a5fa87ba7f7d5380da# cat startup.sh
#!/bin/bash

# EXTERNAL_DB defines if we need to run internal DB
external_db=${EXTERNAL_DB:-false}
if [ "$external_db" = false ] ; then
    # restore default data if it was cleared by volume creation
    if [ -z "$(ls -A $PGDATA)" ]; then
       echo "Postgres directory is empty. Copy default values into it"
       cp -r /var/lib/postgresql/default/* $PGDATA
    fi
    # change permissions in case they were corrupted
    chown -R postgres:postgres $PGDATA
    chmod 700 $PGDATA

    echo Starting compreface-postgres-db
    supervisorctl start compreface-postgres-db
fi

# wait until DB starts
sleep 10
echo Starting compreface-admin
supervisorctl start compreface-admin

# wait until compreface-admin make all migrations
sleep 10
echo Starting compreface-api
supervisorctl start compreface-api

# wait until compreface-admin starts
sleep 10
echo Starting compreface-fe
supervisorctl start compreface-fe

Checked another one, looks the same...I guess they're all the same...

root@compreface:/var/lib/docker/vfs/dir/74c77d64fb806da605adabe72be0b9656cca36d5d902afa7a42d82676ef83da8# cat startup.sh
#!/bin/bash

# EXTERNAL_DB defines if we need to run internal DB
external_db=${EXTERNAL_DB:-false}
if [ "$external_db" = false ] ; then
    # restore default data if it was cleared by volume creation
    if [ -z "$(ls -A $PGDATA)" ]; then
       echo "Postgres directory is empty. Copy default values into it"
       cp -r /var/lib/postgresql/default/* $PGDATA
    fi
    # change permissions in case they were corrupted
    chown -R postgres:postgres $PGDATA
    chmod 700 $PGDATA

    echo Starting compreface-postgres-db
    supervisorctl start compreface-postgres-db
fi

# wait until DB starts
sleep 10
echo Starting compreface-admin
supervisorctl start compreface-admin

# wait until compreface-admin make all migrations
sleep 10
echo Starting compreface-api
supervisorctl start compreface-api

# wait until compreface-admin starts
sleep 10
echo Starting compreface-fe

@pospielov
Collaborator

I wanted to see what is inside this directory on my machine and I didn't find it.
I tried to google this folder and this is what I found:
https://docs.docker.com/storage/storagedriver/vfs-driver/

How the vfs storage driver works
VFS does not support copy-on-write (COW), so each time a new layer is created, it is a deep copy of its parent layer. These layers are all located under /var/lib/docker/vfs/dir/.

It looks like this storage driver uses disk space very inefficiently. Here is what the Docker documentation says:

The vfs storage driver is intended for testing purposes, and for situations where no copy-on-write filesystem can be used. Performance of this storage driver is poor, and is not generally recommended for production use.

Not sure why it is chosen on your docker version
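
To see which storage driver a Docker installation is actually using, `docker info --format '{{.Driver}}'` prints its name. The sketch below wraps the result in a short verdict; the `driver_verdict` function and its messages are illustrative only.

```shell
#!/bin/sh
# Map a storage driver name to a short verdict. The driver name itself
# comes from: docker info --format '{{.Driver}}'
# driver_verdict is a hypothetical helper for illustration.
driver_verdict() {
    case "$1" in
        vfs) echo "vfs: every layer is a full copy, expect heavy disk use" ;;
        overlay2) echo "overlay2: layers are shared rather than copied" ;;
        *) echo "$1: not covered by this sketch" ;;
    esac
}

# On a host with Docker installed:
#   driver_verdict "$(docker info --format '{{.Driver}}')"
driver_verdict vfs
```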

@bentasker

Maybe @bentasker can shed some light about this...

Sorry, been busy.

@pospielov is correct: this is because your Docker instance is using VFS, which is quite inefficient compared to Overlay(2). VFS gets used on a variety of devices/environments, though, because it's robust and supported (more or less) everywhere.

But, @pospielov, the image does use a lot of layers, can I suggest removing some of those? In the dockerfile rather than having lots of RUN directives, chain commands where possible.

As a particular example

RUN apt-get update && apt-get install -y lsb-release
RUN sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
RUN wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
RUN apt-get update && apt-get install -y postgresql-13 \
    && rm -rf /var/lib/apt/lists/*

You'll have 4 layers here, the first 3 of which will contain all of apt's files (so probably a couple hundred meg).

If instead you do

RUN apt-get update && apt-get install -y lsb-release &&\
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' &&\
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && \
apt-get update && apt-get install -y postgresql-13 \
    && rm -rf /var/lib/apt/lists/*

Then you'll have a single layer, and apt's files will never be included in it.

That way VFS users won't see massive storage usage.
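
The `du` figures posted earlier in the thread give a feel for the scale. Dividing the total under vfs/dir (38617420 1K blocks) by the largest single directory (1372400) shows how many full-size copies the total corresponds to. Since some directories are probably smaller, the real directory count is likely higher; `full_copies` is just a hypothetical name for the division.

```shell
#!/bin/sh
# How many copies of the largest directory fit into the du total?
# Both arguments are in 1K blocks, as printed by `du`.
full_copies() {
    echo $(( $1 / $2 ))
}

full_copies 38617420 1372400   # prints 28
```

That is in line with an image of a few dozen layers, each stored as a deep copy of its parent. `docker history exadel/compreface:0.6.1-mobilenet` lists the layers for comparison.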

@alexdelprete
Author

alexdelprete commented Jan 14, 2022

Hi @bentasker / @pospielov,

I just finished migrating all my physical servers to a central proxmox server. I'm using mainly LXC containers to optimize resource usage. The LXC container I used for Compreface is a clean and minimal Debian 11, headless, only openssh+docker+compose plugin.

After the container went up, I used compose to spin up the service using this:

version: '3.4'
services:
  compreface:
    image: exadel/compreface:0.6.1-mobilenet
    restart: unless-stopped
    container_name: "compreface"
    ports:
      - "8000:80"
    environment:
      - POSTGRES_USER=compreface
      - POSTGRES_PASSWORD=password
      - POSTGRES_URL=jdbc:postgresql://postgres.mydomain.lan:5432/comprefacedb
      - EXTERNAL_DB=true
      - API_JAVA_OPTS=-Xmx1g
      - ADMIN_JAVA_OPTS=-Xmx1g

As soon as it starts unpacking, I noticed the storage being consumed very very fast, and the process interrupted several times with out of disk errors. So I had to expand it 4 times, adding 5GB each time, reaching 40GB, only to complete the installation process. I didn't configure Docker specifying VFS, and the installation of docker was made with their official procedure.

I'm no docker fan nor docker expert, where should I check for VFS config, and modify it with overlay2? Is it something that has to be done at docker's config level or in the compose file?

Thanks for the help.

@bentasker

It's configured at the Docker level, so it has a system-wide effect. There are docs here: https://docs.docker.com/storage/storagedriver/overlayfs-driver/

It doesn't sound like that applies in this case, but it's a bit more of a pain if you've got existing or other containers you want to preserve (as you start having to copy stuff about).

I've never tried it within an LXC container though
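
For reference, here is a minimal sketch of checking and switching the storage driver. It assumes a systemd-based host and a `/etc/docker/daemon.json` that does not already hold other settings. The helper names `current_driver` and `write_daemon_json` are hypothetical, not part of Docker.

```shell
# Print the storage driver the Docker daemon currently uses (e.g. overlay2 or vfs).
current_driver() {
  docker info --format '{{.Driver}}'
}

# Write a minimal daemon.json that selects the given storage driver.
# Hypothetical helper: it overwrites the target file, so merge by hand
# if the file already contains other settings.
write_daemon_json() {
  driver="$1"
  path="$2"
  printf '{\n  "storage-driver": "%s"\n}\n' "$driver" > "$path"
}
```

Typical use, as root, would be `write_daemon_json overlay2 /etc/docker/daemon.json`, then `systemctl restart docker`, then `current_driver` to confirm the change. Images and containers created under the previous driver are not visible after the switch, so export anything worth keeping first (for example with `docker save`).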

@alexdelprete
Author

I've installed a central Docker LXC container with 15 services on it, and it's using very little disk space and memory, so I'm very happy with it. I don't think it makes sense to redo everything, as the effect depends on how each specific Docker service is built. The 15 I installed are probably pretty simple, so I don't notice the effect I had with CompreFace.

The strange thing is that I have now started a Debian 11 virtual machine and installed CompreFace on it with no issue, because it's using overlay2 in there. The problem is that the VM consumes a lot of resources compared to the LXC container.

Thanks for all the explanations Ben, lot to learn about these things...and I'm just starting, because I never really liked Docker...maybe I'll start taking a look at Podman, they told me there's less overhead there...and it should be compatible with docker configs and compose files.

@pospielov
Collaborator

@alexdelprete Looks like this is a common problem with proxmox LXC and docker:
https://forum.proxmox.com/threads/docker-in-lxc-container.45204/
https://forum.proxmox.com/threads/docker-lxc-unprivileged-container-on-proxmox-7-with-zfs.99796/
https://forum.proxmox.com/threads/another-docker-experience-on-proxmox.84770/

@bentasker This Dockerfile was deliberately written in a non-optimal way to simplify its maintenance. It consists of parts of the original Dockerfiles. If I optimize it, it will be harder to update it when we update the other Dockerfiles. But I don't believe optimizing it would help, as alexdelprete does not build CompreFace on his servers; he just uses it.
So when any container creates or changes a file, it doubles its size. It could even be caused by logs and server temp files that we can't control.

@alexdelprete
Author

Not sure why it is chosen on your docker version

I found out that when you use ZFS and install Docker in an LXC container, Docker falls back to VFS because of compatibility issues between overlay2 and ZFS. :(

Now I created a VM to bypass the issue, so in there overlay2 is the default and Compreface installs correctly. :)

The only problem is that the mobilenet version didn't work correctly: it gave me a lot of errors, and I couldn't even create a new app/service after logging in. So I reverted to 0.6.1-facenet.

@alexdelprete
Author

alexdelprete commented Jan 15, 2022

If I optimize it, it will be harder to update it when we update the other Dockerfiles

But the single-image version could actually be optimized, since it's not the typical one that will be built, right?

Thanks for the links, the 2nd one https://forum.proxmox.com/threads/docker-lxc-unprivileged-container-on-proxmox-7-with-zfs.99796/ looks quite promising to bypass the problem, I'll test it. Thanks.

@alexdelprete
Author

@bentasker solved it with the fuse-overlayfs driver in the container: it's compatible with ZFS. Credit to this guide.

Now I have a container which is the main docker system, that manages 12 containers, and it's on VFS. :(

I wonder if I can change its driver and restart it without much hassle...hope I don't have to redo everything from scratch. What do you think Ben?

@leccyril

Is there any documentation on the main branch for this case? It is great to have a single container for people whose setup allows only one container...
Great job

@pospielov
Collaborator

https://github.com/exadel-inc/CompreFace/blob/master/docs/Installation-options.md#single-docker-container
Here is the documentation on how to use the single-image version of CompreFace.

@LordNex

LordNex commented Oct 30, 2022

Since the 1.10 update, I haven't been able to get this to install through Home Assistant. The last error was:


The command '/bin/bash -c apt-get update && apt-get install jq -y && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100

@pospielov
Collaborator

Do you build it from scratch or use the published image?
I just tried the published image, and it works fine, at least the default one.

I also don't understand at what stage you get this error, as I don't recall us installing or using the jq tool anywhere.

@Iceman248

Has anyone used CompreFace with postgres 14?

@jjvelar

jjvelar commented Jan 24, 2023

Hi!
How can I access the CompreFace login page from the Nabu Casa remote UI?
Apparently I can't access port 8000 from the Nabu Casa remote UI.
Thanks and best regards,

Jose

@SgtBatten

I have a reasonably stable install using the HA addon, but wanted to take the workload off that and put it on my Unraid server. Starting from scratch with a clean install, I am having issues where after some time (usually after deleting trained images or similar) something breaks and CompreFace cannot be recovered.

@pospielov
Collaborator

This is the worst thing about the single-container version: it's hard to check what is wrong.
What is in the logs?

@toddstar

I have a reasonably stable install using the HA addon, but wanted to take the workload off that and put it on my Unraid server. Starting from scratch with a clean install, I am having issues where after some time (usually after deleting trained images or similar) something breaks and CompreFace cannot be recovered.

If you're removing it from HA, then I'd just use the standard install process, as you're no longer limited to just 1 container.
