Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate to using django-popolo instead of pombola's core models #1594

Open
3 of 10 tasks
mhl opened this issue Jan 19, 2015 · 5 comments
Open
3 of 10 tasks

migrate to using django-popolo instead of pombola's core models #1594

mhl opened this issue Jan 19, 2015 · 5 comments
Labels

Comments

@mhl
Copy link
Contributor

mhl commented Jan 19, 2015

I'm creating this new ticket to more accurately represent the work involved in this. (We had been using #1551 as a placeholder for this job.) This plan probably needs some work.

The great advantage in doing this is that SayIt, when embedded in Pombola, and the rest of Pombola's code would use to the same Person objects. This would remove complex layers of indirection and state, and be a good step in reducing the complexity of maintaining the Pombola ZA scrapers. It's also a good move in terms of code reuse and moving Pombola to be more like a collection of components.

  • Subclass the django-popolo models (multi-table inheritance) and add any missing fields which were in the Pombola core models.
  • Write migrations to migrate data from the core Pombola models to the new django-popolo models.
  • Migrate any other models that refer to the core Pombola models either through ForeignKey or GenericForeignKey
  • Make any use of the original Pombola core models an error (e.g. by replacing the Person class with class Person: pass
  • Update the models that Elasticsearch is indexing (via Haystack) to point to the new models
  • Update all code that uses the original models until the tests pass
  • Update all the rest of the code that uses the original models but isn't covered by tests (adding tests at the time if possible)
  • Update Pombola ZA's SayIt integration to eliminate the SASpeakerRedirectView and the linking of people between saying and Pombola via PopIt IDs (see ZA: improve linking between SayIt speakers and Pombola persons #1551 )
  • Drop the tables with the old core Pombola models
  • 🍸 🍷 🎆
@davea
Copy link
Member

davea commented Feb 2, 2015

A summary of the progress so far (in 1594-django-popolo-migration):

  • Fields required by Pombola and not provided by django-popolo have been identified
  • Subclasses of popolo.models.{Organization,Person,Post} have been created as pombola.core.Popolo{Organization,Person,Post} with the missing fields added.
  • core.Organisation.name needed 200 chars but popolo.Organization.name is defined with 128 max_length, and you can't redefine fields in model subclasses so we're using our own fork of django-popolo with max_length set to 4000 for a few fields.
  • A WIP migration has been created that so far migrates the 'simple' fields for Organisations along with the Contact relations (which are migrated to popolo's ContactDetail model).

I have been keeping some notes whilst working through the process.

@davea
Copy link
Member

davea commented Feb 2, 2015

@dracos points out that mysociety/sayit@46b4dc8 might be handy, too.

mhl added a commit that referenced this issue Feb 2, 2015
…kers

We have changed the form of IDs in the South African PopIt instance
several times, but without removing the entries corresponding to old
PopIt IDs from the popit-django popit_person table. This meant that the
name resolution on importing new committee proceedings could associate
the speeches with various SayIt speakers corresponding to different
entries in that table.

This commit introduces a script to hopefully clean up that mess; it
tries to remove any row with an deprecated format ID from the
popit_person table, safely removing any SayIt Speaker associated with
that row.

This should fix #1597 and hopefully #1601.

The underlying issue (that Pombola ZA stores the identity of people in
five different places) is being addressed by #1594.
mhl added a commit that referenced this issue Feb 2, 2015
…kers

We have changed the form of IDs in the South African PopIt instance
several times, but without removing the entries corresponding to old
PopIt IDs from the popit-django popit_person table. This meant that the
name resolution on importing new committee proceedings could associate
the speeches with various SayIt speakers corresponding to different
entries in that table.

This commit introduces a script to hopefully clean up that mess; it
tries to remove any row with an deprecated format ID from the
popit_person table, safely removing any SayIt Speaker or EntityName
associated with that row. It also removes any rows from this table that
can't be associated with any Pombola core Person any more, so long as
they are not referenced any more.

This should fix #1597 and hopefully #1601.

The underlying issue (that Pombola ZA stores the identity of people in
five different places) is being addressed by #1594.
mhl added a commit that referenced this issue Feb 2, 2015
…kers

We have changed the form of IDs in the South African PopIt instance
several times, but without removing the entries corresponding to old
PopIt IDs from the popit-django popit_person table. This meant that the
name resolution on importing new committee proceedings could associate
the speeches with various SayIt speakers corresponding to different
entries in that table.

This commit introduces a script to hopefully clean up that mess; it
tries to remove any row with an deprecated format ID from the
popit_person table, safely removing any SayIt Speaker or EntityName
associated with that row. It also removes any rows from this table that
can't be associated with any Pombola core Person any more, so long as
they are not referenced any more.

This should fix #1597 and hopefully #1601.

The underlying issue (that Pombola ZA stores the identity of people in
five different places) is being addressed by #1594.
@mhl mhl self-assigned this Aug 4, 2015
@mhl
Copy link
Contributor Author

mhl commented Aug 10, 2015

@duncanparkes and I have done some more work on the data migration, adding migrations for the rest of Person's relationships as well as Position, Organisation, Identifier, and AlternativePersonName.

Those that still need attention include:

  • OrganisationRelationship (might not be in django-popolo)
  • Source (except for contact detail sources, which have been dealt with)
  • Those models in other applications that reference the core models, e.g. speakers in hansard, budgets, members interests, etc. etc.

@mhl
Copy link
Contributor Author

mhl commented Aug 10, 2015

After a quick survey, the other models that refer to the core objects that are being changed in this migration are:

  • hansard.models.Entry.speaker is a foreign key to Person
  • hansard.models.Alias.person is a foreign key to Person
  • tasks.models - has a generic foreign key which would need to be checked
  • images.models - has a generic foreign key, but that should have been dealt with in the Person / Organisation migrations
  • budgets.models - has a generic foreign key which would need to be checked and a foreign key to Organisation
  • pombola.bills.models.Bill.sponsor is a foreign key to Person
  • spinner.models - has a generic foreign key which would need to be checked
  • pombola.members_interests.models.Entry.person is a foreign key to Person
  • pombola.scorecards.models.Entry has a generic foreign key.
  • pombola.slug_helpers.SlugRedirect has a generic foreign key which would need to be migrated
  • pombola.core.models.Contact - check that there are no Contacts that refer to anything other than a Person or Organisation
  • pombola.core.models.InformationSource has a generic foreign key which would need to be migrated
  • pombola.core.models.Place.organisation is a foreign key to Organisation
  • pombola.core.models.ParliamentarySession.house is a foreign key to Organisation
  • pombola.core.models.OrganisationRelationship has two foreign keys to Organisation

@mhl
Copy link
Contributor Author

mhl commented Aug 27, 2015

Things to check:

  • Are the new Haystack search indices being used?
  • We'll need to add Pombola's nice admin enhancement (e.g. inline adding of alternative names and positions on the person admin page) to django-popolo
  • Can we extend the django-popolo admin classes in Pombola? (Presumably yes, they're just classes, but are there any multi-table inheritance gotchas?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants