-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streamline the build and use of the Python image #284
Conversation
That's weird. I'll check this on the other side of the program, this should not happen. Can you please apply these changes to the dev dockerfile as well? Thank you so much! |
Hi, I think we could improve the Docker further:
|
I did and this time I remembered to apply them to the other composefiles as well.
I'd rather leave it explicit for better readability. Some time ago simjanos mentioned a valid concern of docker becoming a bit of a black box to him where they don't know what is going on inside, and that line will remind him where inside the container are models for extra languages installed should they ever forget.
Would this be actually helpful given that we don't pin the package version when installing? Having to bump them manually adds a bit of maintenance burden that this project cannot afford (at least not right now).
That is already done for the image that is deployed to the end user. You are probably looking at the composefile that simjanos uses for development, where it is mounted to allow for hot reloading any changes made without having to rebuild the image. |
Yeah, I was looking at wrong Docker image, now it makes sense to me. I think the requirements.txt is just mostly needed for the freeze of the deps. I get the it adds the maintenance burden, but it could be risky in case something goes wrong or there is a change that affects the tokenizer code. But I think that mostly to @simjanos-dev and his preferences and needs at the time. If it works well now, it's something that could be revisited in the future. |
Can you please resolve the conflicts? I'll merge this in after that. I am not familiar with github and open source processes. If I get a PR and it gets a conflict, should I resolve it, or should I ask for it to be resolved? I do not want to push work to people that would be expected from me. |
This commit switches the base image from ubuntu to python:slim as it makes for a more sensible default that simplifies the build process without compromising the final size. It also moves the setup of the environment variable for the PYTHONPATH and the command to run the main process inside the container to the image itself, where they belong as they are not configurable and any change would break the service.
This commit fixes `sortedItems` not being defined if `sortMethod` was neither "default" nor "spine" which raised an exception, by having it take the default value whenever `sortMethod`is different from "spine". I'm uncertain of when or what introduced this bug.
I don't think you will find a single answer to that, but I personally don't mind resolving the conflicts |
Thank you so much!
That's what I'm afraid of when I try to resolve conflicts too. But probably I should get more comfortable with it.
Will this part be also installed with the new version? I don't remember what, but I remember something needed this installed. |
Yes, |
Although I merged this into dev, I didn't have the time to test it, and did not rebuild my image yet. @mateuszmrw wrote this to me on discord. I will take a look into it, but in the meantime I thought I copy it to you too, maybe you know what's wrong just by looking at it. I have this error after running:
I added --build to build a fresh container.
I attached the whole log in the message.txt file:
|
Removing this line from Python Docker image fixes it:
|
I think removing it would break the extra languages, does it? If it does, try moving it after everything is installed, right before the I will check it more in a couple of hours. |
Could not reproduce, the image built fine from scratch (no cached layers) and I was able to install Japanese and import text in both it and Spanish. I see the issue is related to the MacOS compose file, so it might stem from trying to build an AMD64 image from an ARM system, or at least that's my best guess. |
I got the same error after running these 3 commands:
I'm going to try this again with the change you suggested, will take 30~ minutes, I'll report back:
|
Same.
|
Tried adding numpy to the pip install chain. I got the same error.
|
Running the same steps just for mac-os:
With this image:
Makes the image "work", other than the installing other languages is fricked. |
This does not makes sense, but I'll try it anyways: maybe the problem is that the pythonpath for some reason overwrites the default one, instead of being added to it. But based on the old docker-compose file that would do the same. I replaced it like this. I'll try it again.
|
Nope, same result. But based on this:
I think the problem is that it want to import regular staff from the |
I've tried it with this dockerfile, did not work. However, I deleted my models folder, and it works perfectly now (still testing). Could it be maybe because the new python docker image have different python version? But whatever the reason is, this would be a problem for already existing users.
|
I just realized this is the live dev environment, which means files are not copied to the image but mounted during runtime (I even mentioned it before) while I was only testing the full image all this time; no wonder I couldn't reproduce. At least I think it is safe for users upgrading. I will partly revert the changes to the dev tooling to see if that solves it. |
Thank you! I haven't had time yet, but I'll backup my prod folders, and test with those as well later. |
So... I was overconfident. The dev environment is working now, but I did not back up my prod folder. But I ran into the same issue. Created a dev image, pulled, started the server, and the same numpy error happened. Shut down my server, deleted the model folder, and it worked again. Just to be clear, this was on my prod environment with the dev image. However I made a completely unrelated mistake, and the webserver cannot reach the database, so I'm building a new dev image now. If you want to reproduce, I think these are the steps:
Wait.... I'm an idiot. The docker-compose.yml file was modified, however, I did not update that in my prod environment. Is that something by design, or something that you overlooked too? We do not have a process for automatically pulling the latest docker-compose file from github anymore, and I don't think it should be expected from people to check for changes before they update. |
My brain defaulted to this since that is what other "unstable" projects like Immich do, but now I realize Linguacafe aims for a completely different demographic and we should be more explicit about the need to download a new composefile. I don't expect to introduce any more changes after this (moving the command to the image was the last important change) so hopefully this will be last time we annoy users with something like this. |
Last time I told people (and believed) that it would be the last install process change. This PR does make the process more standardized, and I want to have it, but does not address any important bugs or add any important feature. I would like to revert it back, if it's okay with you. I really don't want to bother people, and even if I create a message about this on the github update release and discord, I know there would be several people who would run into it, without knowing why. If there are more breaking changes that requires users any additional steps from users to update, I would like to introduce those in one update after linguacafe is in a much more developed state, so we can avoid asking people constantly to migrate. One change like that I have is rewriting the database migrations, because they are a mess. (By the way, this is my fault in the first place, because I did not have a proper docker process from the beginning.) |
Yes, that's fine by me. Maybe gather all these small changes and save them for an hypothetical 1.0 release instead of dropping them little by little along the path. |
Yeah, that's what I thought too. I've noticed there are some changes that #294 did not revert back from #284, like:
Could you please revert it back to the original working state, or if you would like to keep something, make sure it will work 100% for all users? I will delete my model folder before I update to it. I don't want to use my prod before it's fixed. My laptop is dying, and freezes for 10+ minutes after I build a new docker image. (Sorry for bothering you so much with it.) |
Sure, but I have a question, should we keep the second commit?
I want to know if that change is redundant and should be dropped or if otherwise you now depend on it and should be kept. The only other change worth keeping is the swap of the base image, the first line of both dockerfiles. |
Yes, we should definitely keep that. Thank you, I really appreciate it! |
Done at #299. |
Someone used the dev image I made today. They got the numpy error. The only thing I can think of is the different python version. I've just got a job, so I wont have a lot of times nowadays. Not sure when Ill get to testing this issue. Docker desktop got deleted from my PC, and now Im using plain docker. It stopped freezing after a build, so I will be able to test issues easier. |
I suspect this is the extra languages backfiring at us. When they were originally installed the whole system used a certain version of python and by extension of many other different packages. But know the half inside the container has been updated while the other half that is stored to a mounted folder is still stuck in the past. Reverting to the old base will solve the issue in the short term, but it will eventually come back at us because we can't keep using python 3.11 forever. It is solved by uninstalling and reinstalling the extra languages, as you already saw. I think it can be solved too by forcing an update too, so I will make some tests to create a function that updates all installed languages, and you can maybe call this either on user prompt or on the background when the user makes a big update. Does that sound good?
Congratulations, by the way. |
I'll do that, because I don't have much time and want to get the next update out.
I think it was an even older version. I'll check the main image. I think we should fix the version number to the old one, and update every 1-2 years, and display a message inside the software for the admin, that uninstalling and installing packages again are necessary. I think it is acceptable, because it is an easy task inside the UI, and has to be done rarely.
|
Just when I was relieved that we finally found the issue.... someone has the same error. He is using a fresh windows install, I'm 99% sure he is on the main image. I asked, will leave a comment if he says otherwise. |
So. I did not want to add any breaking change before v1.0, but suddenly there are so many things I want to add, and I have a feeling we will have more. I want to add websockets, which needs another open port and optional containers for other stuff like self hosted translation service, and in the future AI. Do you think it's reasonable to add updating the docker-compose file to the update process? I really don't want to make more problems for users. (My new PC is so fast compared to what I had, I won't have docker problems anymore.) |
@sergiolaverde0 I do not want to bother you, but what do you think about my last comment? I asked on discord, and they are okay with it. (Just saw native image for arm. Seems awesome!) |
I don't know how I missed both your previous comments, so it is good you mentioned me.
Is this really breaking? I would expect users that don't update won't be able to access this feature but otherwise everything will keep working.
Probably same as above.
Doing so every other version would probably be too annoying, but if we give ample warning for people to update we can make it work:
Ideally, only break things once or twice a year. People will generally understand. |
Yes. Importing texts will happen in the background in a job queue, and websockets will be used to update the opened book's chapters when an import has finished. Other than that, chapters' word counts will also be loaded with websockets, because that is the part that slows down opening a book, and importing dictionaries already have websockets for reporting real time progress in the latest version(features/websockets-vuex branch).
This part would not break anything.
These are great ideas I'll try to do it this way. v0.14 will have the first batch of breaking docker-compose.yml changes and probably won't be many after that, but I'll tell people to check for breaking docker-compose changes before every update. Updates will be less frequent, because I have only a limited time to work on linguacafe now with a job, and I'll mostly do it on the weekends. I'll also add this PR's changes again. Other kind of breaking changes are still collected for v1.0. |
Actually, I think you should schedule the breaking changes for v0.15, and announce them as soon as you feel comfortable (you are technically promising a feature, which can backfire). |
I've already announced it on Discord, and it is getting close to being finished. Will probably add it to the top of the readme file as well. I'm also working on this/prioritizing it to learn for my job, so I kind of have to actually finish it. And like I said, the next update will probably take 2~ months at least due to my limited time, or more if needed. But I see your point, it will have to be well tested and work properly. |
The first commit performs a series of very small refactors that I think make sense, such as moving the non configurable settings from the composefile to the image itself, and using a python base image instead of an ubuntu one just to install python on it immediately.
The second commit "fixes" a bug I encountered when testing the import book functionality where
sortedItems
would not be defined, most likely becausesortMethod
was neither of the options checked, which raised an exception when trying to access it later.I made it so that it gets the default behaviour whenever the "spine" option is not selected, and isolated it on its own commit so it can be easily left out if you'd rather address the issue by different means.