Automatically connect source code repository to package on upload operation #27318

Closed
wants to merge 14 commits into from

@ghost ghost commented Sep 27, 2023

This PR automatically connects a package to its source code repository when the package is uploaded, provided the user or organization has a repository whose name matches the package name.

Possible scenario:
The user has a repository called "gitea". When the user uploads a package called "gitea" to the container registry, the package is automatically connected to the repository with the source code.

image

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Sep 27, 2023
@pull-request-size pull-request-size bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 27, 2023
@github-actions github-actions bot added modifies/api This PR adds API routes or modifies them topic/packages labels Sep 27, 2023
@ghost ghost changed the title Automatically connect source code repository to package on upload Automatically connect source code repository to package on upload operation Sep 27, 2023
@lunny lunny added this to the 1.22.0 milestone Sep 28, 2023
@@ -119,6 +120,17 @@ func getOrCreateUploadVersion(ctx context.Context, pi *packages_service.PackageI
log.Error("Error setting package property: %v", err)
return err
}

repository, err := repo_model.GetRepositoryByOwnerAndName(
Member

So the package name is used as the repository name when searching for the repository?

Author

@ghost ghost Oct 1, 2023

Yes.
Let's say "bob" has a Git repository called "uno". This repository contains the source code for an alpine/debian/rpm/container/etc. package.
Once "bob" uploads a package (of any type) called "uno", the repository with the source code is automatically connected to the package. The search across repositories uses the package name.

image

@ghost ghost mentioned this pull request Oct 7, 2023
@KN4CK3R
Member

KN4CK3R commented Oct 8, 2023

While this approach may work, I don't think we should use it. Most package types have a dedicated metadata field that can contain a URL to a repository. We should use that, and I think there is already a (draft/closed?) PR for it.

@ghost
Author

ghost commented Oct 8, 2023

@KN4CK3R

This approach should be more uniform: a Gitea instance may host repositories for specific projects together with their packages, and the URL in the package metadata does not necessarily point to a repository on that particular instance.

Some examples of Arch Linux packages:

~/iso main !3 > pack -Qi zlib
Name            : zlib
Version         : 1:1.3-1
Description     : Compression library implementing the deflate compression method found in gzip and PKZIP
Architecture    : x86_64
URL             : https://www.zlib.net/
~/iso main !3 > pack -Qi zsh 
Name            : zsh
Version         : 5.9-4
Description     : A very advanced and programmable command interpreter (shell) for UNIX
Architecture    : x86_64
URL             : https://www.zsh.org/
~/iso main !3 > pack -Qi gtk3              
Name            : gtk3
Version         : 1:3.24.38-1
Description     : GObject-based multi-platform GUI toolkit
Architecture    : x86_64
URL             : https://www.gtk.org/

The current approach also applies when different instances host copies of the same project and provide it through their package registries.

@lafriks
Member

lafriks commented Oct 8, 2023

I agree with @KN4CK3R that most package types have a specific field for this, and that field should be used instead of guessing by name.

@ghost
Author

ghost commented Oct 8, 2023

@lafriks

I wanted to clarify a couple of details.

  • The operation is executed only on new uploads.
  • If a git repository with a matched name is not found, it won't stop the upload operation.
  • The connected repository does not affect the 'Project website' link in the package description; that link stays the same.
if created { // Only new package uploads
	if _, err := packages_model.InsertProperty(ctx, packages_model.PropertyTypePackage, p.ID, container_module.PropertyRepository, strings.ToLower(pi.Owner.LowerName+"/"+pi.Name)); err != nil {
		log.Error("Error setting package property: %v", err)
		return err
	}

	repository, err := repo_model.GetRepositoryByOwnerAndName(
		ctx, pi.Owner.Name, p.Name,
	)
	if err == nil { // If repository is not found, it won't cancel upload
		err = packages_model.SetRepositoryLink(ctx, p.ID, repository.ID)
		if err != nil {
			log.Error("Error linking source code repo to container: %v", err)
			return err
		}
	}	
}

Example gitea arch package:

Screenshot from 2023-10-08 19-49-11

The link to the repository in the Gitea instance and the project website may differ, but automatically connecting the repository on package upload shouldn't cause a problem.
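
The three points listed in this comment can be sketched in a self-contained form. The `store` type, `ErrRepoNotFound`, and `linkByName` below are hypothetical stand-ins for illustration, not Gitea's actual models or API. One deliberate difference from the snippet in the comment, which skips linking on any lookup error: this sketch assumes that only a not-found result should be ignored, while other lookup failures are reported.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrRepoNotFound is a hypothetical sentinel error for a missing repository.
var ErrRepoNotFound = errors.New("repository not found")

// store is a hypothetical in-memory stand-in for the repository and package tables.
type store struct {
	repos map[string]int64 // lower-case "owner/name" -> repository ID
	links map[int64]int64  // package ID -> linked repository ID
}

// findRepo looks up a repository by owner and name, ignoring case.
func (s *store) findRepo(owner, name string) (int64, error) {
	id, ok := s.repos[strings.ToLower(owner+"/"+name)]
	if !ok {
		return 0, ErrRepoNotFound
	}
	return id, nil
}

// linkByName links a newly created package to the owner's repository of the
// same name. It reports whether a link was made. A missing repository is not
// an error; any other lookup failure is returned to the caller.
func (s *store) linkByName(created bool, owner, pkgName string, pkgID int64) (bool, error) {
	if !created {
		return false, nil // only new package uploads are considered
	}
	repoID, err := s.findRepo(owner, pkgName)
	if errors.Is(err, ErrRepoNotFound) {
		return false, nil // no matching repository: the upload continues unlinked
	}
	if err != nil {
		return false, err
	}
	s.links[pkgID] = repoID
	return true, nil
}

func main() {
	s := &store{repos: map[string]int64{"bob/uno": 7}, links: map[int64]int64{}}
	linked, err := s.linkByName(true, "bob", "uno", 1)
	fmt.Println(linked, err, s.links[1])
}
```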

@ghost
Author

ghost commented Oct 9, 2023

@lafriks @KN4CK3R

I can rewrite it in such a way that the repository will be automatically connected only if the package metadata URL points to the git repository that exists in the gitea instance where the package is uploaded.

But that approach is not generic for all package types (since the URL field differs across packages). Some packages don't have a project URL or specify it differently. Also, this approach will not work for project forks on the same instance (when users upload packages to forked repositories), since the URL in the metadata will be the same and the repository won't be connected by package name.

Another solution might be to perform two checks (first by project URL from metadata and second by package name) for repository connection, but it still won't work with package uploads to forked repositories.

It won't take much time to create a new draft. Which approach would be better?
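
The two-check idea described in this comment (metadata URL first, package name second) might look roughly like the following sketch. `resolveRepo`, the `repos` set, and the example host `gitea.example.com` are assumptions made for illustration; none of them come from the PR.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// resolveRepo returns the "owner/name" key of the repository to link, or ""
// if nothing matches. repos is a hypothetical set of the repositories on this
// instance, keyed by lower-case "owner/name".
func resolveRepo(instanceHost, metadataURL, owner, pkgName string, repos map[string]bool) string {
	// First check: the metadata URL points at a repository on this instance.
	if u, err := url.Parse(metadataURL); err == nil && strings.EqualFold(u.Host, instanceHost) {
		key := strings.ToLower(strings.TrimSuffix(strings.Trim(u.Path, "/"), ".git"))
		if repos[key] {
			return key
		}
	}
	// Second check: a repository owned by the uploader with the package's name.
	key := strings.ToLower(owner + "/" + pkgName)
	if repos[key] {
		return key
	}
	return ""
}

func main() {
	repos := map[string]bool{"alice/uno": true, "bob/uno": true}
	// External homepage: falls back to the name match.
	fmt.Println(resolveRepo("gitea.example.com", "https://www.zlib.net/", "bob", "uno", repos))
	// URL on this instance: the URL match wins over the name match.
	fmt.Println(resolveRepo("gitea.example.com", "https://gitea.example.com/alice/uno.git", "bob", "uno", repos))
}
```

With `alice/uno` and its fork `bob/uno` on the same instance, a package uploaded by `bob` whose metadata URL still points at `alice/uno` resolves to `alice/uno`. That is the fork limitation described in the comment.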

@lunny
Member

lunny commented Oct 9, 2023

> @lafriks @KN4CK3R
>
> I can rewrite it in such a way that the repository will be automatically connected only if the package metadata URL points to the git repository that exists in the gitea instance where the package is uploaded.
>
> But that approach is not generic for all package types (since the URL field differs across packages). Some packages don't have a project URL or specify it differently. Also, this approach will not work for project forks on the same instance (when users upload packages to forked repositories), since the URL in the metadata will be the same and the repository won't be connected by package name.
>
> Another solution might be to perform two checks (first by project URL from metadata and second by package name) for repository connection, but it still won't work with package uploads to forked repositories.
>
> Won't take much time to create new draft, which approach would be better?

If it's an external package, it's right to not link to a repository in this Gitea instance.

@ghost
Author

ghost commented Oct 9, 2023

@lunny

Some cases will remain uncovered with a different approach.

  1. Automatically connect an uploaded package to the related forked repository

That would be handy for sharing modified versions of built software in packaged form.

  2. Automatically connect packages that don't specify a project URL (containers, for example)

Most of the time, Docker containers are built without labels for source, license, URL, or description. But it is always possible to connect them using image tags or package names.

  3. Automatically connect packages that specify a project webpage instead of a Git URL

None of the packages that specify a webpage or documentation URL as the project homepage would be connected automatically. Also, GitHub/GitLab packages won't be connected.

Users would always be able to connect the repository to the package manually in the UI. Automating this process will make it simpler.

@lafriks
Member

lafriks commented Oct 9, 2023

For different packages there will be different places to look for metadata:

  • container images will have org.opencontainers.image.source that will point to source code repository
  • npm package.json will have repository field
    etc

These should be used to automatically link to repository
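
As a rough illustration of reading those two metadata locations, the sketch below pulls the source URL from a container image's labels and from an npm `package.json`. The function names are hypothetical and this is not Gitea's implementation. It relies only on the `org.opencontainers.image.source` label key and on the fact that npm accepts `repository` as either a string or an object with a `url` key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sourceFromOCILabels returns the value of the org.opencontainers.image.source
// label, or "" if the image does not set it.
func sourceFromOCILabels(labels map[string]string) string {
	return labels["org.opencontainers.image.source"]
}

// repositoryFromPackageJSON extracts the repository value from package.json
// content. It returns "" when the field is absent.
func repositoryFromPackageJSON(data []byte) (string, error) {
	var manifest struct {
		Repository json.RawMessage `json:"repository"`
	}
	if err := json.Unmarshal(data, &manifest); err != nil {
		return "", err
	}
	if len(manifest.Repository) == 0 {
		return "", nil // field is absent
	}
	// Form 1: "repository": "<string>"
	var asString string
	if err := json.Unmarshal(manifest.Repository, &asString); err == nil {
		return asString, nil
	}
	// Form 2: "repository": {"type": "...", "url": "..."}
	var asObject struct {
		URL string `json:"url"`
	}
	if err := json.Unmarshal(manifest.Repository, &asObject); err != nil {
		return "", err
	}
	return asObject.URL, nil
}

func main() {
	labels := map[string]string{"org.opencontainers.image.source": "https://gitea.example.com/bob/uno"}
	fmt.Println(sourceFromOCILabels(labels))

	manifest := []byte(`{"name": "uno", "repository": {"type": "git", "url": "https://gitea.example.com/bob/uno.git"}}`)
	repo, err := repositoryFromPackageJSON(manifest)
	fmt.Println(repo, err)
}
```

Whether the extracted URL refers to a repository on the receiving instance still has to be checked separately; the thread's examples of Arch packages show URLs that point to external project websites.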

@ghost
Author

ghost commented Oct 10, 2023

@lafriks

Packages might provide the following properties:

  1. Origin repository: always points to the Git repository, the main repository for the project. (It might not be possible to create an SQL relation in Gitea's database, but a link can always be provided.)
  2. Project name or tag: might be related to a repository in Gitea, if one exists. (A project name or tag has a better chance of producing an SQL relation in the database within a specific user's scope, since users can only bind packages to repositories they have access to.)
  3. Project homepage: points to the project website, which might not be related to the Git repository.

I would suggest the following solution. It should be possible to specify all of these options; it may look like this:

Screenshot from 2023-10-10 04-44-40

It would provide information about the source code origin and the official project website, and would make it possible to create an SQL relation between the package and a repository in the Gitea instance automatically.

Member

@a1012112796 a1012112796 left a comment

Hmm, I suggest adding a config option for each user.

@ghost
Author

ghost commented Nov 8, 2023

@a1012112796

Is it accurate to say that the plan is to make automatic repository connection optional, controlled by an extra flag in the user settings?

@lafriks
Member

lafriks commented Nov 8, 2023

The first step should definitely be this, as it is the most common case; then we can improve on it for edge cases:

> For different packages there will be different places to look for metadata:
>
>   • container images will have org.opencontainers.image.source that will point to source code repository
>   • npm package.json will have repository field
>     etc
>
> These should be used to automatically link to repository

@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 9, 2023
@ghost ghost force-pushed the package-source-code-repo-autobind branch from e1e97a1 to c650a33 Compare November 9, 2023 14:46
@ghost
Author

ghost commented Nov 10, 2023

@lafriks

I have two implementations. The first, this PR, should work for these cases.
The second is in this branch. It requires modifying the package file creation parameters and methods, along with error handling at the top level. It depends on this PR, and I am not sure which HTTP response code to return when the user does not have permission for repository packages. However, it should result in less code added to the final version and will allow a transaction rollback if an error occurs while connecting to the repository.

@ghost ghost marked this pull request as draft November 14, 2023 03:15
@ghost ghost force-pushed the package-source-code-repo-autobind branch from c3bdf53 to c650a33 Compare November 19, 2023 00:05
@pull-request-size pull-request-size bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 22, 2023
@ghost ghost marked this pull request as ready for review November 22, 2023 20:31
@Renari

Renari commented Jan 15, 2024

I proposed a different solution to this that would not require the package and repo names to match:
#28808

Put simply, the API endpoint would contain the repo name:
https://gitea.example.com/api/packages/{owner}/{repo}/maven

I'm not sure whether this would be harder to implement, since I'm unfamiliar with the Gitea codebase atm.

@ghost
Author

ghost commented Jan 18, 2024

@Renari

I completely agree with the proposed approach, except for changing the link format.

It seems to me that to reduce the amount of logic, it might be better to make a unified solution that could work for all built-in registries. An approach with headers should allow this.

@Renari

Renari commented Jan 18, 2024

#23851

It also adds the ability to link via an API endpoint. While that is not the same as linking automatically, it would let you create the links programmatically. However, I prefer this solution, since it avoids making two API requests.

@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 19, 2024
d1nch8g added 2 commits January 22, 2024 18:25
…sitory after each package upload operation, enhanced error processing, fixed repository get function (not to require owner id), removed all logic from package upload method
@ghost ghost force-pushed the package-source-code-repo-autobind branch from 5fce512 to 2ef3073 Compare January 22, 2024 12:57
@ghost
Author

ghost commented Jan 22, 2024

@KN4CK3R @lafriks @lunny @Renari

Changed the implementation to use request headers for repository connections instead of metadata fields.
This keeps the same approach to repository connection across all package registries and doesn't break compatibility with existing API endpoints.

@lafriks
Member

lafriks commented Jan 22, 2024

I don't quite get why the API would have to be changed for automatic mapping using package metadata.

…d and package metadata field if exists and provided in other cases
@ghost ghost force-pushed the package-source-code-repo-autobind branch from c4fcdfd to e4f162e Compare January 24, 2024 12:29
@lunny lunny modified the milestones: 1.22.0, 1.23.0 Mar 29, 2024
@ghost ghost closed this by deleting the head repository May 13, 2024
@GiteaBot GiteaBot removed this from the 1.23.0 milestone May 13, 2024
This pull request was closed.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. modifies/api This PR adds API routes or modifies them size/L Denotes a PR that changes 100-499 lines, ignoring generated files. topic/packages

6 participants