Duplicate repository #19925

Closed
AyushCloud opened this issue Jan 30, 2024 · 17 comments

Comments

@AyushCloud

Hello,

In Harbor 2.8.4 we see a duplicate repository.

Is this a known issue?
Please find the screenshot attached.

Thank You,
Ayush
dup-repe

@AyushCloud
Author

Hello,

A PostgreSQL query lists both "go_user_task" and "go-user-task".

Maybe PostgreSQL fails to differentiate between "_" and "-".

harbor_db=> select * from repository where name LIKE 'components/go-user%';
 repository_id |           name           | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+--------------------------+------------+-------------+------------+------------+----------------------------+----------------------------
          1870 | components/go-user-task  |        464 |             |         45 |          0 | 2024-01-19 11:45:21.786858 | 2024-01-28 00:40:34.245535
          1929 | components/go-user-task  |        464 |             |         23 |          0 | 2024-01-25 12:58:28.767893 | 2024-01-30 16:17:49.893138
(2 rows)

select * from repository where name LIKE 'components/go_user%';
 repository_id |           name           | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+--------------------------+------------+-------------+------------+------------+----------------------------+----------------------------
          1866 | components/go_user_task  |        464 |             |         41 |          0 | 2024-01-16 15:57:26.64285  | 2024-01-30 15:46:25.162086
          1870 | components/go-user-task  |        464 |             |         45 |          0 | 2024-01-19 11:45:21.786858 | 2024-01-28 00:40:34.245535
          1929 | components/go-user-task  |        464 |             |         23 |          0 | 2024-01-25 12:58:28.767893 | 2024-01-30 16:17:49.893138
          1930 | components/go_user_task  |        464 |             |          1 |          0 | 2024-01-25 14:42:32.075659 | 2024-01-30 16:46:40.25571
(4 rows)
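
A note on the two queries above: in SQL LIKE patterns, "_" is itself a wildcard that matches any single character, so the pattern 'components/go_user%' also matches names containing "go-user". The sketch below reproduces this with Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption, made so the example runs without a Harbor database). The two-column table is a simplified, hypothetical copy of the real one, and "!" is an arbitrarily chosen escape character.

```python
import sqlite3

# In-memory stand-in for Harbor's repository table (simplified, hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repository (repository_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO repository VALUES (?, ?)",
    [(1866, "components/go_user_task"), (1870, "components/go-user-task")],
)

# Unescaped "_" matches any single character, so both spellings are returned.
wildcard = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM repository"
        " WHERE name LIKE 'components/go_user%'"
        " ORDER BY repository_id"
    )
]

# With an ESCAPE clause, "!_" matches only a literal underscore.
literal = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM repository"
        " WHERE name LIKE 'components/go!_user%' ESCAPE '!'"
        " ORDER BY repository_id"
    )
]

print(wildcard)
print(literal)
```

The same ESCAPE clause is available in PostgreSQL, so the escaped form of the query is one way to list only the underscore names.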

@MinerYang
Contributor

No, it does differentiate between "-" and "_".

Could you describe your repository table to determine whether it has a UNIQUE CONSTRAINT?

psql
\c registry
\d repository

It should look like this:

Indexes:
    "repository_pkey" PRIMARY KEY, btree (repository_id)
    "repository_name_key" UNIQUE CONSTRAINT, btree (name)
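
For context on why this check matters: with a UNIQUE constraint on name, a second row with an identical name would normally be rejected, while names that differ only in "-" versus "_" are distinct values and are both accepted. A minimal sketch, using Python's sqlite3 as a stand-in for PostgreSQL (an assumption) and a simplified, hypothetical version of the table:

```python
import sqlite3

# Simplified, hypothetical repository table with a UNIQUE constraint on name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repository ("
    " repository_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE)"
)

def try_insert(name):
    """Insert a repository name; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO repository (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("components/go_user_task"),  # first row
    try_insert("components/go-user-task"),  # differs only in "-" vs "_"
    try_insert("components/go-user-task"),  # exact duplicate of the previous name
]
print(results)
```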

@AyushCloud
Author

Hello,

harbor_db=> \d repository

                                               Table "public.repository"
    Column     |            Type             | Collation | Nullable |                      Default
---------------+-----------------------------+-----------+----------+---------------------------------------------------
 repository_id | integer                     |           | not null | nextval('repository_repository_id_seq'::regclass)
 name          | character varying(255)      |           | not null |
 project_id    | integer                     |           | not null |
 description   | text                        |           |          |
 pull_count    | integer                     |           | not null | 0
 star_count    | integer                     |           | not null | 0
 creation_time | timestamp without time zone |           |          | CURRENT_TIMESTAMP
 update_time   | timestamp without time zone |           |          | CURRENT_TIMESTAMP
Indexes:
    "repository_pkey" PRIMARY KEY, btree (repository_id)
    "repository_name_key" UNIQUE CONSTRAINT, btree (name)
Triggers:
    repository_update_time_at_modtime BEFORE UPDATE ON repository FOR EACH ROW EXECUTE FUNCTION update_update_time_at_column()

It is the same.

The issue arises when we have a repo named "go_user_task" and rename it to "go-user-task".

I also saw a similar post, #19468.
For Harbor version v2.8.4.

Thank You,
Ayush

@MinerYang
Contributor

Could you describe exactly how you renamed the repo from "go_user_task" to "go-user-task"?

@AyushCloud
Author

Hello,

We accidentally used the repo name "go-user-task" instead of "go_user_task", then tagged and pushed the image.
So I am just guessing it could be because of "_" / "-". But I am not sure why it shows two repos with the same name in the UI and in the backend as well. And when we try to delete one, it says "repo not exist".

Thanks,
Ayush

@zyyw
Contributor

zyyw commented Feb 5, 2024

@AyushCloud could you please provide the steps by which you ended up with two components/go_admin_task repositories, as shown in the following image:
go_admin_task

I tried the following steps intending to reproduce this scenario:

  • docker tag alpine:2.6 harbor-domain/components/go-user-task:2.6
  • docker tag alpine:2.6 harbor-domain/components/go_user_task:2.6
  • docker push harbor-domain/components/go-user-task:2.6
  • docker push harbor-domain/components/go_user_task:2.6

but the result was two repositories with different names, components/go-user-task and components/go_user_task. They are distinct names, not the same record occurring twice.
Screenshot 2024-02-05 at 12 25 29

@AyushCloud
Author

Hello,

I also tried to reproduce it, but it is not showing any duplicate repo.

select * from repository where name LIKE 'components/go_admin%';
 repository_id |           name           | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+--------------------------+------------+-------------+------------+------------+----------------------------+----------------------------
          1866 | components/go_admin_task |        464 |             |         41 |          0 | 2024-01-16 15:57:26.64285  | 2024-01-30 15:46:25.162086
          1870 | components/go-admin-task |        464 |             |         45 |          0 | 2024-01-19 11:45:21.786858 | 2024-01-28 00:40:34.245535
          1929 | components/go-admin-task |        464 |             |         23 |          0 | 2024-01-25 12:58:28.767893 | 2024-01-30 16:17:49.893138
          1930 | components/go_admin_task |        464 |             |          1 |          0 | 2024-01-25 14:42:32.075659 | 2024-01-30 16:46:40.25571

Based on the timestamps, I can say that we upgraded to v2.8.4 on 25 January.
So I believe the repos already existed, and the duplicates appeared after the upgrade.
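
The timestamp reasoning here can be checked mechanically. A minimal sketch, assuming the upgrade date of 25 January 2024 given in this comment and using the creation_time values from the query output. The time of day of the upgrade is not stated in the thread, so only calendar dates are compared.

```python
from datetime import date, datetime

UPGRADE_DATE = date(2024, 1, 25)  # from the comment; time of day is unknown

# (repository_id, name, creation_time) copied from the query output.
rows = [
    (1866, "components/go_admin_task", "2024-01-16 15:57:26.64285"),
    (1870, "components/go-admin-task", "2024-01-19 11:45:21.786858"),
    (1929, "components/go-admin-task", "2024-01-25 12:58:28.767893"),
    (1930, "components/go_admin_task", "2024-01-25 14:42:32.075659"),
]

def created_on_or_after(rows, cutoff):
    """Return the ids of rows whose creation date is on or after the cutoff date."""
    ids = []
    for repo_id, _name, created in rows:
        created_at = datetime.strptime(created, "%Y-%m-%d %H:%M:%S.%f")
        if created_at.date() >= cutoff:
            ids.append(repo_id)
    return ids

post_upgrade_ids = created_on_or_after(rows, UPGRADE_DATE)
print(post_upgrade_ids)
```

The two rows created on the upgrade date are the second entries for each name, which is consistent with the duplicates appearing around the upgrade.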

Thanks,
Ayush

@zyyw
Contributor

zyyw commented Feb 5, 2024

Please follow this workaround:

Closing it now.

@zyyw zyyw closed this as completed Feb 5, 2024
@SunilDD

SunilDD commented Feb 23, 2024

Regarding: Duplicate Repository

I encountered the same issue after upgrading from Harbor version v2.8.2-d4c34dcc to v2.8.4-ad3e767d regarding image repository management.

Before the upgrade, I had several images pushed with different repository names, some containing underscores ('_') and hyphens ('-'). For instance, repositories like go_admin_task/image:1.0, go-admin-task/image:1.0, before-upgrade:1.0, and before_upgrade:2.0 were present.

After the upgrade, I noticed that images from repositories with names differing only in the use of underscores or hyphens were unexpectedly deleted. Specifically, repositories with names like go_admin_task/image:1.0 and go-admin-task/image:1.0, as well as before-upgrade:1.0 and before_upgrade:2.0, were affected. Interestingly, repositories without such naming discrepancies, like temp_admin/image:1.0 and temp_admin/image:2.0, remained intact.

Furthermore, attempts to push images to repositories containing underscores or hyphens post-upgrade resulted in the creation of duplicate repositories. This behavior was not observed prior to the upgrade.

I believe this issue warrants attention from the Harbor community as it impacts repository management consistency and may lead to unintended data loss or duplication. Any assistance or insights into resolving this matter would be greatly appreciated.

Thank you
Sunil

@zyyw
Contributor

zyyw commented Feb 26, 2024

Hello,

A PostgreSQL query lists both "go_user_task" and "go-user-task".

Maybe PostgreSQL fails to differentiate between "_" and "-".

harbor_db=> select * from repository where name LIKE 'components/go-user%';
 repository_id |           name           | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+--------------------------+------------+-------------+------------+------------+----------------------------+----------------------------
          1870 | components/go-user-task  |        464 |             |         45 |          0 | 2024-01-19 11:45:21.786858 | 2024-01-28 00:40:34.245535
          1929 | components/go-user-task  |        464 |             |         23 |          0 | 2024-01-25 12:58:28.767893 | 2024-01-30 16:17:49.893138
(2 rows)

select * from repository where name LIKE 'components/go_user%';
 repository_id |           name           | project_id | description | pull_count | star_count |       creation_time        |        update_time
---------------+--------------------------+------------+-------------+------------+------------+----------------------------+----------------------------
          1866 | components/go_user_task  |        464 |             |         41 |          0 | 2024-01-16 15:57:26.64285  | 2024-01-30 15:46:25.162086
          1870 | components/go-user-task  |        464 |             |         45 |          0 | 2024-01-19 11:45:21.786858 | 2024-01-28 00:40:34.245535
          1929 | components/go-user-task  |        464 |             |         23 |          0 | 2024-01-25 12:58:28.767893 | 2024-01-30 16:17:49.893138
          1930 | components/go_user_task  |        464 |             |          1 |          0 | 2024-01-25 14:42:32.075659 | 2024-01-30 16:46:40.25571
(4 rows)

Could you please open an issue with the PostgreSQL project asking why it lists both components/go_user_task and components/go-user-task when running select * from repository where name LIKE 'components/go_user%'; ? I believe this is not a Harbor issue, because you are running raw SQL against a PostgreSQL data set.
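
Because "_" in a LIKE pattern is a single-character wildcard in standard SQL, a LIKE query cannot show whether two rows carry exactly the same name. An exact comparison with grouping avoids the wildcard entirely. A minimal sketch, using Python's sqlite3 as a stand-in for PostgreSQL (an assumption). The table is a simplified, hypothetical copy that deliberately has no UNIQUE constraint, so the duplicate rows from the output can be recreated for the demonstration.

```python
import sqlite3

# Simplified, hypothetical copy of the repository table, without the UNIQUE
# constraint, populated with the ids and names from the query output.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repository ("
    " repository_id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER)"
)
conn.executemany(
    "INSERT INTO repository VALUES (?, ?, ?)",
    [
        (1866, "components/go_user_task", 464),
        (1870, "components/go-user-task", 464),
        (1929, "components/go-user-task", 464),
        (1930, "components/go_user_task", 464),
    ],
)

# Group on the exact name; any group with more than one row is a true duplicate.
duplicates = conn.execute(
    "SELECT name, COUNT(*), MIN(repository_id), MAX(repository_id)"
    " FROM repository"
    " GROUP BY name"
    " HAVING COUNT(*) > 1"
    " ORDER BY name"
).fetchall()

for name, count, first_id, last_id in duplicates:
    print(name, count, first_id, last_id)
```

The GROUP BY / HAVING query itself is plain SQL and could be run in psql against the real table.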

@zyyw zyyw reopened this Feb 26, 2024
@stonezdj
Contributor

stonezdj commented Feb 27, 2024

Another question: has this Harbor instance ever been upgraded from a version older than v2.3.0?

See the FAQ: https://github.com/goharbor/harbor/wiki/Harbor-FAQs#duplicate-repository-name-in-the-same-project

@AyushCloud
Author

Hello,

We found this issue in our production environment, which is upgraded from time to time.
But we were then able to reproduce it on a freshly deployed Harbor v2.7.x that was then upgraded to v2.8.4.

The main issue is that when two repos exist whose names differ only in "_" versus "-", the images are deleted after the upgrade, and when we try to push any image to those repos, the repos are duplicated, which is visible in database queries.

So it seems the main issue is that the repos should not be empty after the upgrade.

Thank You,
Ayush


@SunilDD

SunilDD commented Feb 29, 2024

Steps to Reproduce the issue:

Before Upgrade: Version -> v2.8.2

  • Uploaded images to repositories with similar names like go_admin_task and go-admin-task, differing only in the use of '-' and '_'.
  • Uploaded images to a single repository, either temp_admin or temp-admin, containing either '-' or '_'.

After Upgrade: Version -> v2.8.4

  • Upon attempting to access images from repositories like go_admin_task and go-admin-task, images were found to be deleted, despite being present before the upgrade.
  • Images in the temp_admin repository were still accessible post-upgrade.
  • Trying to push images to repositories with names like go_admin_task and go-admin-task resulted in the creation of duplicate repositories.

These observations suggest a discrepancy in repository management post-upgrade, specifically regarding repositories with similar names distinguished only by the use of '-' and '_'.
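
To find which repositories match the pattern described in these steps, one option is to group names that become identical once "-" and "_" are treated as the same character. This is a minimal sketch: the function name is hypothetical, the sample names are taken from the comments in this thread, and whether Harbor itself treats the two separators as equivalent during the upgrade is not established here.

```python
from collections import defaultdict

def find_separator_collisions(names):
    """Group names that are identical once '-' and '_' are treated as equal."""
    groups = defaultdict(list)
    for name in names:
        # Normalize both separators to "_" to build the grouping key.
        groups[name.replace("-", "_")].append(name)
    # Keep only groups that contain at least two distinct spellings.
    return sorted(sorted(group) for group in groups.values() if len(set(group)) > 1)

repos = [
    "go_admin_task",
    "go-admin-task",
    "temp_admin",
    "before-upgrade",
    "before_upgrade",
]
collisions = find_separator_collisions(repos)
print(collisions)
```

Running such a check against the repository list before an upgrade would show which repositories fall into the affected group, such as go_admin_task and go-admin-task, and which, like temp_admin, do not.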

Thank you,
Sunil

@SunilDD

SunilDD commented Mar 29, 2024

@zyyw Any update on the above issue (repository duplication)?

@MinerYang
Contributor


This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label May 28, 2024

This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please re-open a new issue.

@github-actions github-actions bot closed this as not planned Jun 27, 2024