Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip indexing for some files #2330

Merged

Conversation

mccalluc
Copy link
Member

@mccalluc mccalluc commented Nov 6, 2017

Fix #2106? Does this seem like a reasonable approach? I still want to:

  • Simplify the solr queries we construct, since these documents won't even be in the index now. (ie, add Remove filetypes fq #2288 back in.)
  • Move this list to configuration.
  • Review other lists of hard-coded document types.

@codecov-io
Copy link

codecov-io commented Nov 6, 2017

Codecov Report

Merging #2330 into develop will decrease coverage by 0.97%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2330      +/-   ##
===========================================
- Coverage    51.31%   50.33%   -0.98%     
===========================================
  Files          389      381       -8     
  Lines        26981    26433     -548     
  Branches      1246     1240       -6     
===========================================
- Hits         13844    13306     -538     
+ Misses       13137    13127      -10
Impacted Files Coverage Δ
refinery/data_set_manager/utils.py 76.1% <ø> (-0.07%) ⬇️
refinery/user_files_manager/tests.py 100% <ø> (ø) ⬆️
refinery/data_set_manager/models.py 80.89% <100%> (-0.19%) ⬇️
refinery/data_set_manager/search_indexes.py 75.55% <100%> (-16.4%) ⬇️
refinery/data_set_manager/tests.py 99.68% <100%> (-0.05%) ⬇️
refinery/galaxy_connector/models.py 26.92% <0%> (-73.08%) ⬇️
refinery/core/middleware.py 0% <0%> (-58.34%) ⬇️
refinery/file_store/tasks.py 34.57% <0%> (-5.11%) ⬇️
refinery/data_set_manager/views.py 40.96% <0%> (-4.53%) ⬇️
... and 56 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 896d06b...fdd17b7. Read the comment docs.

@mccalluc
Copy link
Member Author

mccalluc commented Nov 6, 2017

On second thought, perhaps not a config, but class constant. I think this is all that is needed now?

Do not merge for 1.6.1

@@ -71,6 +72,8 @@ def _assay_data(self, object):
# https://groups.google.com/forum/?fromgroups#!topic/django-haystack/g39QjTkN-Yg
# http://stackoverflow.com/questions/7399871/django-haystack-sort-results-by-title
def prepare(self, object):
if object.type not in Node.INDEXED_FILES:
raise SkipDocument()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can SkipDocument accept a message? Could be useful for debugging.

@mccalluc
Copy link
Member Author

mccalluc commented Nov 7, 2017

@ngehlenborg : Can you schedule time to give input on this, or delegate someone who can make a call? There are several fairly simple changes we could make that would fix the behavior, but we're not sure which is best.

@mccalluc
Copy link
Member Author

@scottx611x : Talked with Nils: Coming out of that, the idea was to show all documents, but for all the doc types we don't recognize, change the type, but also add a new field to the object so we could theoretically export the original.

With a little distance, that feels too complicated, and too fragile. How would you feel about merging this original PR, and putting the rest on the back burner?

@scottx611x
Copy link
Member

@mccalluc I agree on the fragility of: for all the doc types we don't recognize, change the type, but also add a new field to the object so we could theoretically export the original

Let me look over this PR again first and I'll give a better opinion

@scottx611x
Copy link
Member

@mccalluc I'd be okay with merging this as long as we move our future intentions into a new issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants