Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start developing Debate XML import/export #1076

Merged
merged 28 commits into from
Jan 13, 2021

Conversation

tienne-B
Copy link
Member

This commit starts creating an interface for the import/export and implementation of the Debate XML specification. Here, a side-wide view for importing an XML file is created, with a consumer link for asynchronous processing.

More to come

This commit starts creating an interface for the import/export and
implementation of the Debate XML specification. Here, a side-wide view
for importing an XML file is created, with a consumer link for async
processing.
@tienne-B tienne-B changed the title Start developing Debate XML import Start developing Debate XML import/export Apr 30, 2019
This commit adds a tournament url for tournament export with a stub
page.
@tienne-B
Copy link
Member Author

I'm a bit concerned I'm putting the cart before the horse here, as the schema has not been reviewed or discussed: TabbycatDebate/DebateXML#4. @czlee, @philipbelesky: Mind looking it over (even just cursorily)? I should get back to this in a week or so.

@philipbelesky
Copy link
Member

Happy to look over it, although might need to wait a few days.

@philipbelesky
Copy link
Member

I've mostly commented over in the DTA repo, but just was curious about the desire to write an importer and whether you were planning to do that before/after/simultaneous with the exporter? It's certainly an interesting idea. Initially I pictured the point of DTA as enabling much more of an archival-storage/ data-vis purpose, but I guess having a Tabbycat importer means that even if the spec doesn't take off more broadly we would have a robust way of mitigating against data loss and/or combining tournaments into a single instance.

@tienne-B
Copy link
Member Author

An importer would be useful for me as a way to collate various tournaments previously on individual sites to a central site, of which is needed by SUCDI for end-of-year prizes. Having the importer would also be useful as TC already has some data visualisations and analyses (and result views) that would make the archive format more accessible.

I am creating the importer simultaneously with the exporter as I think having the importer gives a visible meaning for the exporter (although that isn't the main point); aiding with adoption.

@tienne-B tienne-B force-pushed the debate-xml branch 8 times, most recently from 26887db to b632ee9 Compare May 22, 2019 21:35
This commit adds an exporter for single tournaments into an XML format
as specified in dta-spec.
@tienne-B
Copy link
Member Author

So exporting seems to be working well but with N+1 SQL inefficiencies that I'm finding hard to resolve in the .add_rounds() method. What prefetches on round_set should I do so that DebateResult does not fetch from the database for each debate?

In order to expose the debug toolbar to view the SQL queries, I add a typo in another method.

@philipbelesky
Copy link
Member

Hmm, I think to avoid DebateResult set being fetched the debates need to be gathered without any reference to their current status (and obviously the winners involved). Are the N+1 issues severe, or just inelegant? Given the scope of the exporter I wonder if it might be easier to accept some of them.

@tienne-B
Copy link
Member Author

Sorry, I don't understand on the points of "fetching without any reference of current status", nor what would be the threshold between inelegance and being severe. On the first, is it an inability to prefetch a table in various ways (e.g. from speakerscore: debateteam__team__speaker and speaker)?

A new page is added detailing the XML format and how to import/export
it to/from Tabbycat. A distinction has been made between it and database
backups.
@tienne-B tienne-B force-pushed the debate-xml branch 4 times, most recently from 88c173f to 051c100 Compare June 3, 2019 22:21
Also separate methods for adding debates, add gender to people, and
removing team short name.
This commit starts an importer for Debate XML (The "Agora Project") with
support for all that can be exported by Tabbycat. Some considerations
are marked in the documentation page about this format.
@tienne-B
Copy link
Member Author

@czlee: From #1199, should .bulk_create() be avoided here in favour of one-by-one?

@czlee
Copy link
Member

czlee commented Sep 2, 2019

Umm… yeah. Unless it's painfully slow, like, more than a minute or so. But otherwise I think I'd rather display a progress bar and have it more robust to future changes in .save() functions. Otherwise we'd have no choice but to violate DRY (or promise never to use .save(), pre_save and post_save).

I'm open to discussion on this, though.

@tienne-B
Copy link
Member Author

tienne-B commented Sep 2, 2019

Understood. I'll prepare a patch for performance comparison (after #1180 as it would improve result importation). Having a progress bar would be quite difficult as import does not use consumers.

@czlee
Copy link
Member

czlee commented Sep 3, 2019

Hmm, should the import use consumers? Normally this is best practice for anything that takes a nontrivial amount of time?

(It's fine if you don't want to do this for a first implementation! Just thinking out loud.)

Some methods have also been added to DebateResult classes for the
change.
@tienne-B
Copy link
Member Author

tienne-B commented Oct 5, 2019

This PR depends on #1180 for the use of DebateResult for XML. The refactor should be merged before this PR, as merging this will indirectly merge the other.

To be able to use new DebateResult classes in importing/exporting.
This commit replaces the uses of `.bulk_create()`, instead saving each
DB object one-by-one. This is to trigger the `.save()` of the models and
the triggers.

The results importer has not been touched, and will be modified to make
use of DebateResult in another commit.
This commit replaces all the logic to determine the result of a debate,
such as the margin/splits, etc. with completing a DebateResult object
and saving that.

This commit refers to methods created as part of TabbycatDebate#1180.
This is necessary as get_winner() has the same effect.
This commit fixes the DebateResult generation as well as some typos and
omissions caused from previous changes.
Also rename uses of 'test score'.
As booleans must be either 'true' or 'false' in XML (note the use of
lowercase), rather than the capitalized keywords used in Python.
# Conflicts:
#	config/requirements_core.txt
#	tabbycat/importer/urls.py
#	tabbycat/importer/views.py
#	tabbycat/options/views.py
#	tabbycat/results/dbutils.py
#	tabbycat/results/result.py
#	tabbycat/results/utils.py
#	tabbycat/urls.py
Even though email addresses and conflicts would not be typically added
to an exported XML, the importer should still use them if they exist.
@tienne-B tienne-B merged commit 4215998 into TabbycatDebate:develop Jan 13, 2021
@tienne-B tienne-B deleted the debate-xml branch January 13, 2021 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants