Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: Docker images that track this repo #69

Open
samyak-jain opened this issue Jan 2, 2021 · 10 comments
Open

Suggestion: Docker images that track this repo #69

samyak-jain opened this issue Jan 2, 2021 · 10 comments

Comments

@samyak-jain
Copy link
Contributor

Right now, the instructions ask us to use the following repo: https://github.com/ankicommunity/docker-anki-sync-server to deploy this server using Docker.

IMHO, I think the following

  • This is the wrong approach to take. Making a docker image is fairly straightforward. There is no reason this has to be in a seperate repository
  • The docker repo linked doesn't even track this repo despite the fact that this is the most actively maintained version of the server. The last commit for this repo was 4 days ago whereas for tsudoku's version is almost a year ago.
  • Because of this, the official docker image does not contain critical fixes needed for the application to function in many use cases. For example, since I use Arch Linux, any version of the server which does not contain a fix like Fix issue with arch linux client #57 will be completely broken.

I think the best approach here should be to include a github action so that the latest tag of the official docker image tracks the master of this repo.

@samyak-jain
Copy link
Contributor Author

@VikashKothary I am willing to help contribute on this. We can probably just use the Dockerfile from https://github.com/ankicommunity/docker-anki-sync-server. Let me know if you agree with my assessment.

@VikashKothary
Copy link
Member

VikashKothary commented Jan 5, 2021

Hi @samyak-jain,

Thanks for your suggestion. It's a really interesting topic and it think this a very valid suggestion, which comes with both pros and cons. Let's break it down.

I want to apologise in advance because I think this will be a common question by many contributers and users as such I want to document my reasons for the current setup.

The docker repo linked doesn't even track this repo

Firstly, what you're saying is very true. The fact that our docker images doesn't use this repo for its builds, is in my opinion the most critical issue in the community currently. I believe that it is in the process of being fixed, see: https://github.com/ankicommunity/docker-anki-sync-server/pull/21/files and ankicommunity/anki-devops-services#13. Feel free to join the conversation and review the PR that has been made.

Secondly, I would agree that including the Docker image as part of the source code repos is a more common practice compared to separate repos for it. The current setup is my current process but it's one that also comes with its own pros and cons. So I'll try and explain my thought process and I'd love to hear what you think.

Making a docker image is fairly straightforward.

1) The Docker repo is more than just a Dockerfile.

See my comments on all the different steps I follow when developing and deploying containers: ankicommunity/anki-devops-services#11 (comment). While writing a single Dockerfile is simple enough, the comment lists 5 different outputs/steps which need to be created to properly deploy the container. Which some users will be happy enough with just a Docker image, other users will feel that is not enough. @kuklinistvan and @AntonOfTheWoods do a great job of arguing both sides in the ticket and I'd recommend a read because it's really insightful.

2) Dockerfiles should be linted, tested and documentated.

What wasn't mentioned in the ticket above, is that all 5 outputs are just source code (configuration as code). Dockerfiles like any other project should be linted for easy readability. This includes all bash/python scripts included in them that have nothing to do with this project itself.

Dockerfiles can also be tested to confirm that the Dockerfiles work. I believe this is an important step to prevent shipping broken images.

Dockerfiles also have different documentation in relation to them. This could involve describing how the images can be customised (using environment variables), also information on compatibility with different cloud services.

3) A development Dockerfile is not a production Dockerfile

The next point I want to make is the a Dockerfile included in repos tend to favour development. An example of this is using a Ubuntu base image over a Alpine or Distoless base image. This makes sense but makes the images a lot more chunky and could cause security issues when running in production.

4) Dockerfiles multiply like bunnies

Currently this repo requires two Docker iamges. Nginx and Python. A single centralised could support both. In the future, it can support others include databases which have been preconfigured to work with the Anki servers.

5) It's a specialised skill.

The idea is that this repo is for Python developers and the Docker repo is for DevOps Engineers. We also have Djankiserv which includes its own Docker images so if a DevOps Engineer wants to make a change to all the images, they can do so all in one repo. This should scale better.

7) Is this the perfect setup? No.

The biggest downside for this approach is the added complexity. This is the reason I have doubts with this setup however it does seem to scale better. If you know how we can maintain the benefits in a simpler workflow then I'd definitely be keen.

Let me know your thoughts.

@samyak-jain
Copy link
Contributor Author

Thanks for the thoughtful and detailed response! Here are my 2 cents:

  1. That's true and if the community is interested in supporting multiple deployment methods, that's great. However, I think at the very least, the docker image should be supported. For someone like me, I just need this for personal use for syncing the data across devices. I just use a docker compose setup because anything above that would be overkill. I suspect that this would be the case for a significant chunk of the people who use this server.
  2. I think if you're worried about too many configuration files cluttering this repo, it would be best if the Dockerfile alone is moved to this repo (along with any scripts required to build the image). I can understand some things being in a seperate repo. For example, if I was creating an ansible role to deploy anki-sync-server, I can imagine putting it in a different repo. Regaring linting, I don't see how that would stop us from moving the image to this repo. And regarding documentation, we can choose to add it to the Readme or create a wiki, or any of the multitude of options that exist nowaday. Whatever works.
  3. Agreed. We can have a tracking issue on this if you want unless there's already a PR for this. I don't think this should be a blocker though
  4. Not sure I understand your point. Git was built for collaboration. If people want to change just the Dockerfile, they can make the PR just to that file. They don't need to read the python codebase to do that. Pretty much every other project does it this way.

One reason I insist on having the Dockerfile alone in this repository is so that we can setup CI to get the latest version of the anki server. For example, if any update to AnkiDesktop breaks the server, and sometime later this project patches up the issue. I will have to wait for someone to manually update the docker image which as you are aware is not even tracking the only active version of the server.
IMO, having an up to date docker image would, for the vast majority of people, be enough. If people want a custom setup that relies on the image, they can do that. Generally people with more sophisticated setups know what they are doing.

I understand a lot of the points you bring up here. But I think if we keep thinking about how we build production grade infrastructure and build the most perfect DevOPS setup, we are letting the perfect be the enemy of the good.
IMO, we should ideally have the Dockerfile here and setup github actions to automatically publish to DockerHub. Let me know your thoughts.

@samyak-jain
Copy link
Contributor Author

Missed your 4th point

  1. I'm not sure what you are referring to here? I don't see mention of nginx in the docker repo. If you're referring to using nginx as a web server. Of course, every server in the world uses some kind of web server whether it is nginx, apache etc... And plenty use proper databases. You don't need seperate dockerfiles for that. For something like using nginx, that is so basic I don't see any point in adding any special documentation for that. Regarding database, if we ever go that route, maybe we can provide a sample docker-compose in our documentation. Since these use cases are so basic, anyone using a more sophisticated setup will know what to do. There can be some times when we need to create a new dockerfile but IMO that would be very rare and I don't see that happening for this project anytime soon. So I don't see this being a concern. Let me know if I misunderstood what you're referring to.

@VikashKothary
Copy link
Member

Hi @samyak-jain,

Okay I'm going to take a step back and discuss not the how but the what for this issue.

Am I correct to understand, that this issue would be solved, if there was a CI build, that released a new Docker Image whenever a commit was made to either the master or develop branches in the repo, regardless of where the Dockerfile is?

Or is this a ticket, about the best practices in managing Dockerfiles?

@samyak-jain
Copy link
Contributor Author

Yeah. I think we can consider this issue resolved if a CI is setup to track this repo. If you'd like, discussion of best practices can move to a different issue.

@samyak-jain
Copy link
Contributor Author

Hey @VikashKothary , just a friendly ping. I am willing to take this on. Let me know if it's ok if I work on this.

@VikashKothary
Copy link
Member

Hi @samyak-jain,

So another contributor raised a similar issue. And like I said in that ticket, what you're saying makes sense. But I'm currently in the process of improving our build process, so if you continue to think there is an issue after I've made my changes, then I'm happy to discuss this further.

I know you wish to see the latest changes from this project being deployed to our Docker images so you can use it without all this additional hassle, so I will try and make the changes ASAP.

@samyak-jain
Copy link
Contributor Author

Sounds good! Thanks for working on this. For now, I'm using my fork which has Dockerfile + Github Actions: https://github.com/samyak-jain/anki-sync-server. Feel free to leverage any part of this in your efforts. Also let me know if there's anything that you feel requires help and I'd be happy to chip in.

@VikashKothary
Copy link
Member

Hi @samyak-jain,

We're making progress on this issue. We've implemented Github Actions in the devops repo and I hope to create a release to update the image on Dockerhub. See ankicommunity/anki-devops-services#11 for more information on that.

While both tickets cover similar topics, I want to keep this ticket open because you make a good point that changes to this repo should trigger Docker builds and that hasn't been implemented. So I think this ticket should cover the changes needed in this repo to make that happen.

Until that has happened, feel free to join the conversation on the other topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants