Question regarding the AGPL-3.0 license #2129

Closed
1 task done
algomaks opened this issue Apr 19, 2023 · 36 comments
Labels
question Further information is requested Stale

Comments

@algomaks

algomaks commented Apr 19, 2023

Search before asking

Question

I just saw that Ultralytics has adopted a new licence - AGPL-3.0 - and on their website it says: "Yes, all Ultralytics YOLO-trained models fall under the AGPL-3.0 License. The AGPL-3.0 License covers the training code and the models produced by that training code."

My question is - if I train a new model on my own dataset using the framework provided by Ultralytics, does my newly trained model fall under AGPL-3.0 even though I do not use any Ultralytics code for inference and it is exported to ONNX?

In my opinion the answer should be: clearly no. My logic is that if, for example, I use drawing software licensed under AGPL-3.0 to draw a few images, my new images do not fall under AGPL-3.0. If I use a word processor to write a book, my book does not fall under the license of the software. So, do my trained models fall under AGPL-3.0 in your opinion?

I would like to hear your opinion on this topic. Thanks.

@algomaks algomaks added the question Further information is requested label Apr 19, 2023
@github-actions

github-actions bot commented Apr 19, 2023

👋 Hello @algomaks, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Install

Pip install the ultralytics package including all requirements in a Python>=3.7 environment with PyTorch>=1.7.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@algomaks
Author

algomaks commented Apr 20, 2023

Thanks @glenn-jocher, your answer makes sense and answers my question.

Just one clarification, when you say "use any derivative of YOLOv8 in your training pipeline" do you mean one of the pre-trained models? In other words, if I use my own dataset to fine-tune one of your pre-trained models, then the fine-tuned model falls under AGPL-3.0?

@glenn-jocher
Member

@algomaks I checked with our legal team! This is how it works: both the training code and models are under AGPL, so any downstream solution that includes YOLOv8 training code or a trained model (both pretrained or custom trained) inside it should be open-sourced to comply with the AGPL conditions.

If you'd prefer not to open-source your work we have enterprise licenses available at https://ultralytics.com/license

Let me know if this answers your question!


@algomaks
Author

Hi @glenn-jocher,

Thanks for taking the time to provide a clarification.

I understand and agree with the part regarding the training code and pretrained models. I also understand and support your effort to protect your work, especially from larger corporations which might decide to compete with you using your own code.

With all due respect to your legal team, I think AGPL does not work this way at all when it comes to custom-trained models (trained from scratch on one's own datasets). These models are not part of the software; they are the data output from the usage of the software, and as such they do not fall under AGPL.

Here is an example: If a programming language interpreter is released under the AGPL, does that mean programs written to be interpreted by it must be under GPL-compatible licenses? The short answer: No. The interpreted program, to the interpreter, is just data. It is quite logical.

@Haniaaliii

Hello @glenn-jocher
I would like to better understand the limitations of the commercial use permission provided by AGPL.
Do I have the right to adapt some of the YOLOv8 code for my own purposes and then do whatever I want with it? If yes, are there any steps that need to be taken before doing so?
Same question for the GPL license for YOLOv7.
Thank you!

@glenn-jocher
Member

Hello @Haniaaliii, to clarify: any modifications to AGPL software or inclusion of modified AGPL software in your own project means your project as a whole must be licensed under AGPL as well. For GPL, your project as a whole would need to be licensed under GPL as well. You are free to adapt the code for your own purposes as long as you follow all the licensing terms. I recommend speaking with a lawyer to ensure you comply with all requirements under the license agreements.

@github-actions

github-actions bot commented Jun 2, 2023

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Jun 2, 2023
@github-actions github-actions bot closed this as not planned Jun 14, 2023
@awiegersma

awiegersma commented Jul 28, 2023

@glenn-jocher

> I checked with our legal team! This is how it works: both the training code and models are under AGPL, so any downstream solution that includes YOLOv8 training code or a trained model (both pretrained or custom trained) inside it should be open-sourced to comply with the AGPL conditions.

This is not how it works according to information I obtained:

If a neural network is trained using AGPL (Affero General Public License) software, the neural network itself does not automatically fall under the AGPL license. The AGPL license primarily applies to the software’s source code and any modifications or derivative works based on that code.

The output or results produced by a neural network trained with AGPL software would generally be considered data rather than software source code. As such, the output or results, including the trained neural network model, are not bound by the AGPL license.

Can you explain where in the AGPL it is stated that data also is covered by the license or ask your ‘legal department’ about this?

@glenn-jocher
Member

@awiegersma hello, thank you for your valuable input.

The matter of whether trained models are considered part of the software or just mere output is indeed a complex issue and can be interpreted in different ways.

The statement you obtained, which describes the trained models as data rather than software source code, is a significant viewpoint. However, our interpretation is based on viewing the weights of a trained model as a transformation of the original software, not merely as independent data. Hence we believe they fall under the protective umbrella of AGPL.

While AGPL may not explicitly state that data is covered by the license, the interpretation can vary based on the definition of derivative works or transformations, and legal perspectives may differ.

We understand the importance of this matter and the need for clarity. We are working closely with our legal team to elucidate these complex licensing matters. Your concerns are greatly appreciated.

Thank you!

@awiegersma

awiegersma commented Aug 1, 2023

@glenn-jocher

> However, our interpretation is based on viewing the weights of a trained model as a transformation of the original software, not merely as independent data. Hence we believe they fall under the protective umbrella of AGPL.

This is a pretty strong claim that you should validate and not merely believe. The weights of a neural network are not considered software in the sense of a computer program or executable code. Weights are numerical values and are therefore just data.

I don’t see any reason to stick to the AGPL when using custom trained Yolo v8 networks.

@glenn-jocher
Member

@awiegersma thank you for your insights.

The interpretation of GPL depends highly on the jurisdiction and the context. We acknowledge that our interpretation may differ from others. It touches on the transformative nature of machine learning and how a model trained using a base framework should be classified. We see the model weights more as a derivative of the software, while others see them as data.

We understand that there are multiple interpretations of this issue. That's why we strongly recommend every user consult their own legal counsel to ensure compliance with open-source licensing in their specific use case. We always strive to encourage an open, collaborative environment under fair licensing terms.

We appreciate your understanding and your contributions to this important discussion.

@queenley

Hi @glenn-jocher, thank you for your support.
I have a case like @awiegersma's.

I also agree that using custom-trained YOLOv8 networks needn't follow the AGPL license. But I hope that you can confirm exactly this case.

@bit-scientist

> Thanks @glenn-jocher, your answer makes sense and answers my question.
>
> Just one clarification, when you say "use any derivative of YOLOv8 in your training pipeline" do you mean one of the pre-trained models? In other words, if I use my own dataset to fine-tune one of your pre-trained models, then the fine-tuned model falls under AGPL-3.0?

Hi, @algomaks. I see that your reply came right after the github-actions template. Which of Glenn's answers were you referring to? He (or someone else) must have removed it for some reason. Could you explain the terms in more depth? Thank you!

@glenn-jocher
Member

Hi @bit-scientist,

Yes, your understanding is correct. If you use a pre-trained model provided by YOLOv8 and fine-tune it on your own dataset, the resulting fine-tuned model would fall under the AGPL-3.0 License. This is because the fine-tuned model is considered a derivative of the original pre-trained model. In this case, any use of the fine-tuned model would be subject to the conditions of the AGPL-3.0 License.

Let me know if you have any further questions!

@bit-scientist

@glenn-jocher , I won't believe a word you say until you prove it's not glenn-gpt writing these 😆

@glenn-jocher
Member

Hello @bit-scientist, thank you for your humorous comment! Please rest assured that all responses, including this one, are written by human team members. We take all questions seriously and strive to provide careful and detailed answers according to our knowledge and expertise. Even when it's not "glenn-gpt" at the keyboard, we're always happy to assist. Let us know if you have any questions about YOLOv8!

@syncsyncsync

Hi @glenn-jocher,
First of all, thank you for your hard work on this project. It's a fantastic tool, and I'm interested in using it for my project.

I noticed that the model weights are licensed under AGPL. Could I use these weights with my own non-AGPL inference program? Would you say that my script would need to follow AGPL's copyleft policy? For context, my script is designed to work with a variety of models, yours included.

Your input on this would be very valuable to me. Thanks in advance for your guidance.

@glenn-jocher
Member

@syncsyncsync hello,

Thank you for your kind words and we're glad to hear that you find our project beneficial.

When it comes to your question, I want to clarify that using model weights that are under the AGPL license with a non-AGPL inference script technically does result in a combined work that is subject to the AGPL. This would mean any derivative works or adaptations based on the AGPL-licensed YOLOv8 model should be licensed under the AGPL as well.

However, licensing is a complex issue and interpretations of these regulations can vary. Hence, it's essential to seek legal advice to answer questions regarding the specifics of your situation. It's important to ensure you're complying with all the licensing requirements especially when mixing licenses in one project.

I hope this information is helpful, and feel free to reach out if you have more questions.

@syncsyncsync

@glenn-jocher,
Thank you.
I'm not a legal expert, and I'm not sure if the license terms fully match your interpretation.
But I understand your intention: using your model for fine-tuning or building from scratch falls under the AGPL license, as does using inference software with the model.

To be honest, this is a little far from what I understood when I started using YOLO, but I believe your intentions should be respected.

@glenn-jocher
Member

@syncsyncsync hi there,

Thanks for understanding our point of view and our interpretation of the licensing terms. Indeed, as you have mentioned, our understanding is that fine-tuning our models or using our models for inference is subject to the terms of the AGPL license.

It's a complex subject, and we completely understand how it can be confusing or different from initial perceptions about using open-source projects. It’s worth noting that these interpretations are primarily derived to keep the open-source spirit alive, while also setting some boundaries to respect the efforts of the developers and contributors who have given their time and knowledge to create these resources.

We value your respect for our intentions, and we, in turn, respect all contributors and users like you in the open-source community. Feel free to raise any further questions or concerns you may have, as we're here to help!

Best regards.

@TechnikEmpire

> @algomaks I checked with our legal team! This is how it works: both the training code and models are under AGPL, so any downstream solution that includes YOLOv8 training code or a trained model (both pretrained or custom trained) inside it should be open-sourced to comply with the AGPL conditions.
>
> If you'd prefer not to open-source your work we have enterprise licenses available at https://ultralytics.com/license
>
> Let me know if this answers your question!


This doesn't seem true to me whatsoever. You can't automatically copyright program outputs based on the fact that you copyrighted the program. Blender 3D touched on this once and is a great example. The code is GPL, the outputs are not because they're not your creative works. Especially in this case, it's literally random weights tuned until an acceptable approximation is found and every model trained from scratch would be different.

@algomaks
Author

@TechnikEmpire, thank you, this is exactly how I understand it too.

@TechnikEmpire

@algomaks to be clear, it's probably a slight gray area with regard to their published models. Technically they produced them. Are the weights sufficient to qualify for copyright? Probably not, but maybe. Training yourself, you get your own weights.

https://mwhlawgroup.com/can-copyright-in-software-extend-to-its-output/

There was one case in the States where someone argued that their software does the "lion's share", and so copyright should be theirs.

I'm not a lawyer / not legal advice etc. My recommendation is to just look at alternative models. Look up paperswithcode object detection. Whenever you see someone using the GPL for no reason other than to funnel people into commercial deals, just look elsewhere. These companies rise and fall all the time.

@glenn-jocher
Member

@TechnikEmpire, thank you for the link and the in-depth discussion. It’s great to see such engagement from the community.

You’ve made some salient points regarding copyright and how it applies to model weights. It is indeed a complex topic that even legal professionals often grapple with. Our interpretation of the AGPL license is that it covers both the training code and resulting models (whether pre-trained or custom trained).

However, we understand that interpretations can vary. As with any legal issue, especially those that are in a "gray zone," we always encourage consulting with a legal advisor to fully understand the potential implications for specific use-cases.

As for finding alternative models, we fully support the idea of exploring all options. The field of object detection has numerous excellent models, and it's always a good idea to choose the one that fits your specific needs and constraints the best.

Again, thank you for your insights. Your perspective is valuable and helps foster a more informed and nuanced discussion about these matters within our community.

@1andDone

@glenn-jocher ultralytics/yolov5#4716 (comment)

The AGPL-3.0 license allows the use of YOLOv5 for commercial purposes, including projects for freelance clients. You can train the YOLOv5 model using your freelance clients' custom dataset and use it in your projects without needing to pay for an Enterprise license.

So in this case, the freelancers' clients would need to open source all of their work?

@glenn-jocher
Member

@1andDone hello! If you train the YOLOv8 model using your client's custom dataset and implement it in your project, you will indeed be following the AGPL-3.0 license. This means that the derivative work (in this case, your implementation of the model) would also need to be made open source.

So, if your project containing the YOLOv8 model is distributed, the corresponding source code of your project, along with any modifications, should be made available to the end users under the AGPL-3.0 license. This is in line with the principle of copyleft underlying the AGPL-3.0 license.

However, please note that the above information is not legal advice and it is recommended that you, or your clients, consult with a legal professional to get the most accurate advice based on your specific situation. Licensing, especially for open source projects like YOLOv8, can be complex and it's best to ensure you're in compliance to avoid any legal trouble later on.

@rwightman

@glenn-jocher

I ran across this thread trying to figure out if YOLO v8 in Keras (https://keras.io/examples/vision/yolov8/) (+ on Kaggle hub https://www.kaggle.com/models/keras/yolov8) was in violation of any ultralytics yolo licensing... not clear but looks like they may just be ported weights or maybe they ran it by you?

There are some interesting developments and discussions happening these days that may end up firming up the situation regarding training data copyrights and weight copyright licensing. One thing about weights themselves has been pointed out in the past: copyright as it stands likely does not apply to them, since they are not the output of a creative human process. Indeed, Meta recently got shot down by GitHub after they talked to their lawyers about a takedown re weights https://github.com/github/dmca/blob/280652a060b86de87f223737ae54307a292fb96b/2023/04/2023-04-27-meta-counternotice.md

This mirrors similar arguments about the outputs of such models themselves (ie generative model outputs not being copyrightable)...

And without copyright, licenses like GPL/AGPL have nothing to stand on.

@TechnikEmpire

@glenn-jocher just use YOLO-NAS. It's better and faster. Projects like this are always doomed to fail.

@glenn-jocher
Member

@TechnikEmpire hey there! Thanks for the suggestion! 🚀 YOLO-NAS is indeed a great project with some impressive benchmarks. It's always exciting to see the advancements and variety in the YOLO ecosystem. Each project has its own strengths and use cases, and it's fantastic that the community has options to choose from based on their needs. If you have any specific feedback or features you love about YOLO-NAS, feel free to share!

@andrey-skat

@glenn-jocher A compressed file is also a set of "weights" created by a program from some input data (the "dataset"). So if a GPL program compresses a file, a non-GPL program can't read it or include it in a distribution? Obviously not. Moreover, you can't tell which particular program created those weights. Also, weights are just a set of numbers that can be wrong and can be created by a human. You can't tell what created a particular set of numbers.

@glenn-jocher
Member

@andrey-skat hey there! 😊 You bring up a very thought-provoking perspective on how data (like weights) is generated and utilized. In the context you've described, weights generated by a program are indeed just sets of numbers. They can be interpreted or created in various ways, not exclusively through the use of a specific piece of software.

As for licensing, it generally applies to the software code itself and not the data produced (like compressed files or model weights). So, you're right in pointing out that using weights without directly including or integrating GPL-program code in a distribution doesn't automatically impose GPL constraints on the user.

It's always a good idea to consult with a legal expert for questions tailored to specific use cases, especially when navigating the nuances of open source licenses. Keep exploring and asking great questions! 👍

@Burhan-Q
Member

Burhan-Q commented Apr 1, 2024

> In my opinion the answer should be - clearly no.

Opinions are not facts and do not absolve anyone of legal obligations with respect to licensing, just as speeding is illegal even if you didn't know what the speed limit was. If you need to know with certainty what is permitted by the license, you should consult a lawyer. Otherwise you can read the AGPL-3.0 terms, but even after ~20 reads myself, I wouldn't claim a complete understanding of every detail.

What I can tell you is that it specifically covers source code, object code, and corresponding source code, which means that anything generated from the source code is also covered. It means that the weights themselves are also covered by AGPL-3.0, both the native PyTorch weights and any exported or even duplicated versions of the models.

The intention of AGPL-3.0 is to be a "viral" open-source license. Instead of permitting anyone to make improvements without sharing, AGPL-3.0 requires all related source code, object code, and corresponding source code (that's not already freely available) to be made available publicly under the same license. This is intended to prevent anyone from "secretly" making improvements without sharing them:

> The GNU Affero General Public License is designed specifically to ensure that, ..., the modified source code becomes available to the community.

@rwightman

@Burhan-Q the issue is more fundamental than AGPL-3.0 or any one specific license and any of the fine print. All of these licenses build on copyright: you start with the copyright that you own, and the license specifies what permissions you grant as the copyright holder. Without copyright, no license.

Copyright only protects 'works of human creation'. So barring a landmark ruling that says otherwise, most licenses on weights are unlikely to hold if challenged. I'm curious though whether, in working through ambiguities on training dataset copyright, limits of fair use, etc., the courts may arrive at a more formal definition of the copyrightability of weights in relation to their training data.

There's already case law in relation to outputs of programs, the copyright is not attributable to the program but the user using the program to create. A license that tries to claim otherwise won't legally stand up.

@Burhan-Q
Member

Burhan-Q commented Apr 2, 2024

@rwightman your point is taken, and it is beyond my understanding of what does or doesn't pass for copyrightable material. What I can speak to, as an employee of the company, is the applicability of the license to Ultralytics and the related assets.

The question of where or how copyright applies is beyond the scope of what this issue seeks to answer, which is the more practical question, "what can I do now?" As it stands today, all model weights are covered by AGPL-3.0 as far as their use or distribution from a business perspective. I'm not a lawyer or studied in law, but this is how our Legal team has advised us to speak on this matter.

Although it's my opinion and would not necessarily be meaningful in court, I would assert that the model weights could reasonably be argued to be a "work of a human", as the numerical weights are merely the state of the designed model structure. The model architecture still exists in the model weights; they are not solely a matrix of numbers with no other information, since the layers and their connections are also present. Perhaps the isolated numerical weights are not subject to copyright, but the model weights, which include information about the model, would, I suspect, be considered copyrightable. Of course my opinion doesn't make it true, but it's the argument I would make, as separating the numerical values from the model structure would render the weights useless.

@geometrikal

geometrikal commented Apr 4, 2024

@Burhan-Q Rather than going back and forth about what is and isn't covered, could you just change or update the licence to make it explicit? And if a suitable one doesn't exist, maybe now is a good time to introduce it.

@Burhan-Q
Member

Burhan-Q commented Apr 4, 2024

Not my call

@ultralytics ultralytics locked and limited conversation to collaborators Apr 4, 2024
14 participants