Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPDX copyright: make project reuse-compliant #306

Merged
merged 39 commits into from
Sep 30, 2022

Conversation

aesteve-rh
Copy link
Member

@aesteve-rh aesteve-rh commented Aug 31, 2022

Make the whole project reuse-compliant and add
a lint job to check that all files have their copyright.

Signed-off-by: Albert Esteve aesteve@redhat.com

@aesteve-rh aesteve-rh self-assigned this Aug 31, 2022
@aesteve-rh aesteve-rh marked this pull request as draft August 31, 2022 08:21
@aesteve-rh
Copy link
Member Author

Just an idea, since I noticed outdated copyright on recently updated files, and this is something generally difficult to maintain if we don't automate it.

Complemented with a script to update copyright automatically, could make keeping the copyright updated much easier.

Adding this to a CI may seem like extra burden for contributors, but take into account that this only will pop up at the beginning of each year, and relax later on.

Copy link
Member

@mz-pdm mz-pdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this change would be very useful. It would be better to replace the years and the whole copyright texts with short SPDX identifiers. If you'd like to take such an effort, there was some related discussion in comments to https://gerrit.ovirt.org/c/vdsm/+/116741 .

automation/check-copyright.py Outdated Show resolved Hide resolved
@aesteve-rh
Copy link
Member Author

I did not know that this was discussed before.

In principle, I do not intend to replace the format, just have a simple check to ensure that the year is updated in the modified files in a branch. No global update, but incremental, depending on the files updated. No format check.

I mean, it could be just as simple as looking for the year string in the first two lines, and remove the regex, if we prefer it to be completely format agnostic.

But if there is no need to enforce/check for copyright updates, we can dismiss the PR.

@mz-pdm
Copy link
Member

mz-pdm commented Aug 31, 2022

AFAIK the years do not have a significant legal meaning so although it's good to keep them up-to-date, it's not strictly necessary. It doesn't harm to have a tool to check them but I'm not sure it's worth to maintain it in automation, IMO the resources could be better spent towards the conversion to SPDX and to get rid of the problem completely (which I admit is more difficult but perhaps not that much if we can agree how to do it). Let's see what the others think.

@nirs
Copy link
Member

nirs commented Aug 31, 2022

The best way is to replace the copyrights (removing the years) and the license boilerplate using the ‘reuse’ tool, and add ‘reuse lint’ step to the ci.

https://reuse.software/

see how it integrated in blkhash:

https://gitlab.com/nirs/blkhash/-/blob/master/.reuse/templates/blkhash.jinja2

@aesteve-rh
Copy link
Member Author

I have been playing with the reuse tool. It is really nice indeed. The branch needs some refinement (probably some copyright comments are in the wrong style), but is a very powerful tool.

@mz-pdm
Copy link
Member

mz-pdm commented Sep 2, 2022

Cool, looks promising. It needs some polishing (it sometimes removes more than just the copyright notice) and explanations (do we keep the original copyright notices? will we still have years in SPDX identifiers?) but hopefully it will achieve the desired result.

Why to add LICENSES/ when we already have COPYING?

@aesteve-rh
Copy link
Member Author

aesteve-rh commented Sep 2, 2022

do we keep the original copyright notices? It respects the copyright line with year and company, but removes the rest of the surrounding header. An example of such behaviour (in https://github.com/oVirt/vdsm/blob/5292e3380d205d99ed5c9b650de79069d20ad066/Makefile.am):

# Copyright 2008-2020 Red Hat, Inc.
# SPDX-FileCopyrightText: 2022 Red Hat, Inc.
# SPDX-License-Identifier: GPL-2.0-or-later

Previously:

#
# Copyright 2008-2020 Red Hat, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
#
# Refer to the README and COPYING files for full details of the license
#

If we prefer to keep the full header, that is configurable (in the template).

will we still have years in SPDX identifiers? That is also configurable. In the current version, I have kept the years in the SPDX identifiers, which is the default behaviour. But we can omit them if it is preferred.

Why to add LICENSES/ when we already have COPYING? LICENSES is the name of the folder where the licenses are downloaded to make the repository reuse-compliant. I checked if it is configurable, but I didn't find anything, it always talks about the LICENSES folder. We may want to remove COPYING (and adapt the main README.md) if we opt for this solution.

@aesteve-rh
Copy link
Member Author

@aesteve-rh aesteve-rh changed the title check-copyright: add automation script SPDX copyright: reuse exploration Sep 2, 2022
@mz-pdm
Copy link
Member

mz-pdm commented Sep 2, 2022

I think we needn't duplicate the copyright notices. So if we have

SPDX-FileCopyrightText: SOMETHING A
SPDX-FileCopyrightText: SOMETHING-B

then there is no need to keep

Copyright SOMETHING-A
Copyright SOMETHING-B

at the same place. But make sure all the copyright lines (I mean those single "Copyright XXXX YYYY" lines, not full headers) are copied to SPDX-FileCopyrightText entries, I could see some non-RH ones were missing.

As for years, I'm not sure what to do about them. We'd like to get rid of updating them, OTOH we should probably not discard the existing ones blindly. Examples in https://spdx.github.io/spdx-spec/v2.3/file-tags/ show entries both with and without dates, https://spdx.github.io/spdx-spec/v2.3/file-information/#8.8 suggests to keep the information as it is, including dates. So I'd keep the dates for now.

If LICENSES/ is required then OK. Whether to keep COPYING in such a case: It defines the default licence, which may be relevant for files without copyright information. So it's probably better to keep it.

IANAL so trying to be on the safe side with my suggestions. Maybe someone else can clarify it.

Copy link
Member

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too big to review. Try to covert by in smaller parts (e.g. storage, virt, network, infra).

lib/vdsm/common/glob.py Outdated Show resolved Hide resolved
Makefile.am Outdated Show resolved Hide resolved
lib/vdsm/alignmentScan.py Outdated Show resolved Hide resolved
lib/vdsm/common/glob.py Outdated Show resolved Hide resolved
lib/vdsm/common/glob.py Outdated Show resolved Hide resolved
lib/vdsm/ppc64HardwareInfo.py Outdated Show resolved Hide resolved
lib/vdsm/storage/asyncevent.py Outdated Show resolved Hide resolved
CODEOWNERS Outdated Show resolved Hide resolved
automation/artifacts.repo Outdated Show resolved Hide resolved
static/usr/share/man/man1/vdsm-client.1.license Outdated Show resolved Hide resolved
@aesteve-rh aesteve-rh added this to the ovirt-4.5.3 milestone Sep 5, 2022
Copy link
Member

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More cleanup needed before this.

lib/vdsm/common/glob.py Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
@aesteve-rh aesteve-rh force-pushed the aesteve/check-copyright branch 2 times, most recently from 2d0fe0c to 1a5b490 Compare September 8, 2022 13:42
@nirs
Copy link
Member

nirs commented Sep 8, 2022

reuse lint
make: reuse: Command not found

Please add reuse to the containers before you add reuse to the lint target.

@aesteve-rh
Copy link
Member Author

Requirements are updated already in the first commit, and I built and tested the containers locally.

Once we integrate #311 and #294 I will rebase, build and push the containers, and this will be ready for review :)

@aesteve-rh
Copy link
Member Author

aesteve-rh commented Sep 12, 2022

After aligning with @nirs I postponed #294 to after this PR. I'll ready the PR up, should be good in its current status.
Waiting for #294.

Add GPL copyright header in all markdown
files in the project with the file extension.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to shell
scripts in the project.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to Makefiles[.am]?

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header in the root
project folder files.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to the files
in the docker folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header in all
files with *.yml extension in the project.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to vdsm_hooks folder files.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to files in the
vdsm_log folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to all xml files
in the project. These xml are parsed later
and make tests to fail if we add an inlined
copyright headers. In order to avoid this,
create explicit license files for project xmls.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header for files with
the .conf extension in the project.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GLP license files for json files.
JSON format does not accept comments.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to ax_python_module.m4.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to files in the
static folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to files in the
build-aux folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to the rest of the
files in the tests folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to the files
in the ci folder.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add GPL copyright header to the rest of
the project files not covered by other commits.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Add reuse environment in tox to run the linter.
Then, add a Makefile target to run the reuse
environment, which allows the pipelines to check
the SPDX copyright headers and ensure that the
project is reuse-compliant.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Document new SPDX headers policy, new
contrib/add-spdx-header.sh script to add
SPDX header to new files, and a few reuse
command examples to handle non-default licenses.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
@mz-pdm
Copy link
Member

mz-pdm commented Sep 29, 2022

Why do we have so many files in Vdsm? :-) Looks OK to me as I skimmed through it. Let's preferably merge this before branching, to avoid troubles with future backports.

lib/vdsm/storage/asyncevent.py Show resolved Hide resolved
build-aux/gitlog-to-changelog Show resolved Hide resolved
lib/vdsm/momIF.py Show resolved Hide resolved
@mz-pdm mz-pdm merged commit 7497a48 into oVirt:master Sep 30, 2022
@aesteve-rh aesteve-rh deleted the aesteve/check-copyright branch September 30, 2022 08:40
@mz-pdm
Copy link
Member

mz-pdm commented Sep 30, 2022

/ost basic-suite-master el9stream

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants