Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Project -> ForeignKey -> RemoteRepository #8259

Merged
merged 4 commits into from Jun 17, 2021

Conversation

humitos
Copy link
Member

@humitos humitos commented Jun 14, 2021

Migrate the relation between Project and RemoteRepository to be a ForeignKey from Project model, so we can attach multiple projects to the same RemoteRepository. This would allow us to import the same repository multiple times and use VCS permissions to manage it.

πŸ§‘β€πŸ­ Work done

  • add Project.remote_repository
  • migrate data from RemoteRepository.project to Project.remote_repository
  • drop RemoteRepository.project
  • change ?expand=project,organization by ?expand=projects,remote_organization on /api/v3/remote/repositories/ endpoint

πŸ“ Notes

  • /api/v3/remote/repositories/?expand=project,organization
  • all test creating RemoteRepository with a project attached have to be updated
  • project.remote_repository does not raise RemoteRepository.DoesNotExist anymore; now it just return None (e.g. readthedocs.builds.tasks.send_build_status)
  • require to change all places where remote_repository.project = project to remote_repository.projects.add(project)

I just starting collecting some notes here, but I'm sure there will be more things to consider and take care of

Closes #8232
Related #8229

@humitos humitos force-pushed the humitos/remoterepository-project-fk branch 3 times, most recently from dfc2572 to 60b491b Compare June 15, 2021 11:21
Copy link
Member Author

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agjohnson please take a close look at this PR and let me know if it does make sense to you.

Comment on lines 6 to 11
def migrate_data(apps, schema_editor):
RemoteRepository = apps.get_model('oauth', 'RemoteRepository')
queryset = RemoteRepository.objects.filter(project__isnull=False).select_related('project')
for rr in queryset:
rr.project.remote_repository = rr
rr.save()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agjohnson this data migration from the old field to the new field is an important change to take a look at carefully.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, this may be slow in production if we have many RemoteRepositoy linked to a Project.

Comment on lines 1816 to 1818
:query string expand: allows to add/expand some extra fields in the response.
Allowed values are ``project`` and ``organization``.
Allowed values are ``projects`` and ``remote_organization``.
Multiple fields can be passed separated by commas.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agjohnson this a change in the API request/response required to make sense with the new relation. It also reserves ?expand=organization to be used for Organization objects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to say we should keep project around for backwards compatibility, but this API should really only be internal use anyways. So yeah, agreed this seems fine.

@humitos humitos marked this pull request as ready for review June 15, 2021 11:34
@humitos humitos force-pushed the humitos/remoterepository-project-fk branch from 4a325ba to 8160a3d Compare June 15, 2021 11:34
@humitos humitos requested review from saadmk11, agjohnson and a team June 15, 2021 11:34
- add `Project.remote_repository`
- migrate data from `RemoteRepository.project` to `Project.remote_repository`
- drop `RemoteRepository.project`

- change `?expand=project,organization` by
  `?expand=projects,remote_organization` on /api/v3/remote/repositories/
  endpoint
Copy link
Member

@saadmk11 saadmk11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look great! πŸ’―

I think we may want to add remote_repository to the Project ModelAdmin raw_id_fields

raw_id_fields = ('users', 'main_language_project', 'remote_repository',)

Copy link
Contributor

@agjohnson agjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! If anyone shares my concern about performance here, we might want to find a way to make the migration faster or operate in chunks. It might be possible with a bulk query 🀷

Comment on lines 1816 to 1818
:query string expand: allows to add/expand some extra fields in the response.
Allowed values are ``project`` and ``organization``.
Allowed values are ``projects`` and ``remote_organization``.
Multiple fields can be passed separated by commas.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to say we should keep project around for backwards compatibility, but this API should really only be internal use anyways. So yeah, agreed this seems fine.

queryset = RemoteRepository.objects.filter(project__isnull=False).select_related('project')
for rr in queryset.iterator():
rr.project.remote_repository = rr
rr.save()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a second to follow this logic, but I think this looks good. I suppose I would have concerns about performance here, we will have a few thousand remote repositories to step through individually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked this: 22k. We should be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double-checked this logic and it didn't work.

rr.project.remote_repository = rr

That doesn't seem to keep the relation.

I also tried,

rr.project.remote_repository_id = rr.pk

I don't understand why.

Copy link
Member Author

@humitos humitos Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, I found that I was just saving the wrong object.

We need either:

rr.project.save()

or

rr.projects.add(rr.project)

However, both are producing an error that I'm not sure to understand 😞

database_1  | 2021-06-17 12:05:45.877 UTC [193] ERROR:  cannot ALTER TABLE "projects_project" because it has pending trigger events
database_1  | 2021-06-17 12:05:45.877 UTC [193] STATEMENT:  SET CONSTRAINTS "oauth_remotereposito_project_id_246c363d_fk_projects_" IMMEDIATE; ALTER TABLE "oauth_remoterepository_2020" DROP CONSTRAINT "oauth_remotereposito_project_id_246c363d_fk_projects_"

@humitos humitos merged commit a489572 into master Jun 17, 2021
@humitos humitos deleted the humitos/remoterepository-project-fk branch June 17, 2021 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change RemoteRepository relationship to Project to a many-to-many relationsihp
3 participants