Raise an error when attempting to bulk-index without any corpora #568

danielmitterdorfer · 2018-09-06T05:29:06Z

source: https://discuss.elastic.co/t/no-throughput-metrics-available-for-bulk-likely-cause-the-benchmark-ended-already-during-warmup/147368/8

The root cause of the problem in the discussion above was that the user used outdated track syntax and wondered why no documents had been bulk-indexed. We could make this more explicit by raising an error in the bulk parameter source if the list of corpora is empty.

bartier · 2020-04-13T02:21:24Z

Hi! Can I try to work on this?

To test this, I ran Rally as shown below, after removing the corpora definition from the 'percolator' track in my local default repository to force the error:

./rally --track=percolator --challenge=append-no-conflicts --kill-running-processes --distribution-version 7.6.0

~/.rally/benchmarks/tracks/default/percolator/track.json

{% import "rally.helpers" as rally with context %}

{
  "version": 2,
  "description": "Percolator benchmark based on AOL queries",
  "indices": [
    {
      "name": "queries",
      "body": "index.json"
    }
  ],
  "operations": [
    {{ rally.collect(parts="operations/*.json") }}
  ],
  "challenges": [
    {{ rally.collect(parts="challenges/*.json") }}
  ]
}

I got the following error:

Is this a valid way to reproduce this error? If so, I would propose implementing your suggested validation in TrackSpecificationReader#_create_corpora, raising an error if the list of corpora is empty:

    def _create_corpora(self, corpora_specs, indices):
        if len(corpora_specs) == 0:
            raise exceptions.TrackConfigError(f"There is no document corpora definition for track {self.name}.")
        document_corpora = []
        known_corpora_names = set()
        ...

With the implementation above I got the TrackConfigError when trying to use a track without any corpora:

By the way, there is a None in the error message, and I'm not sure whether this is something that could be hidden from the user.

danielmitterdorfer · 2020-04-14T12:15:29Z

It is perfectly fine to define a track without a corpus (for example, if you only want to run a query benchmark). I'd instead modify the constructor of BulkIndexParamSource. This is where it determines which corpora should be used:

rally/esrally/track/params.py

Line 459 in cc2296b

self.corpora = self.used_corpora(track, params)

After that line, I'd add a check whether that list is empty and, if it is, raise exceptions.InvalidSyntax.
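
A minimal, self-contained sketch of that check might look like the following. It is not Rally's actual code: InvalidSyntax here is a local stand-in for exceptions.InvalidSyntax, BulkIndexParamSourceSketch and its simplified used_corpora are hypothetical, and the error message wording is only illustrative.

```python
from types import SimpleNamespace


class InvalidSyntax(Exception):
    """Local stand-in for Rally's exceptions.InvalidSyntax (assumption)."""


class BulkIndexParamSourceSketch:
    """Hypothetical, heavily simplified version of the bulk parameter source."""

    def __init__(self, track, params):
        self.corpora = self.used_corpora(track, params)
        # The proposed check: fail fast if there is nothing to bulk-index.
        if not self.corpora:
            raise InvalidSyntax(
                "There is no document corpus definition for track %s." % track.name
            )

    def used_corpora(self, track, params):
        # Simplified assumption: use all corpora of the track, optionally
        # restricted to the names listed in params["corpora"].
        wanted = params.get("corpora")
        if wanted is None:
            return list(track.corpora)
        return [c for c in track.corpora if c in wanted]


# A track without corpora is still a valid track, but bulk-indexing it is not.
empty_track = SimpleNamespace(name="percolator", corpora=[])
try:
    BulkIndexParamSourceSketch(empty_track, {})
except InvalidSyntax as e:
    print("Rejected: %s" % e)
```

Placing the check in the parameter source rather than in the track reader keeps query-only tracks without corpora valid, which is the point made above.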

By the way, there is a None in the error message, and I'm not sure whether this is something that could be hidden from the user.

Good point; this is likely due to the top-level error handler:

rally/esrally/rally.py

Lines 746 to 760 in cc2296b

    
    logging.getLogger(__name__).exception("Cannot run subcommand [%s].", sub_command)
    msg = str(e.message)
    nesting = 0
    while hasattr(e, "cause") and e.cause:
        nesting += 1
        e = e.cause
        if hasattr(e, "message"):
            msg += "\n%s%s" % ("\t" * nesting, e.message)
        else:
            msg += "\n%s%s" % ("\t" * nesting, str(e))
    console.error("Cannot %s. %s" % (sub_command, msg))
    console.println("")
    print_help_on_errors()
    return False

I assume that this exception has no cause attached and we mistakenly extract None at some point. IMHO this should be solved separately from this issue, though.
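
To make the message-building logic easier to experiment with, the loop from the handler above can be pulled into a small standalone function. This is only a sketch: ErrorSketch is a hypothetical exception class with message and cause attributes, and the case where a nested exception carries message=None is one assumed way a literal None could reach the output, not a confirmed diagnosis.

```python
class ErrorSketch(Exception):
    """Hypothetical exception with `message` and `cause` attributes."""

    def __init__(self, message, cause=None):
        super().__init__(message)
        self.message = message
        self.cause = cause


def flatten_messages(e):
    """Mirror the handler's loop: one line per nested cause, tab-indented."""
    msg = str(e.message)
    nesting = 0
    while hasattr(e, "cause") and e.cause:
        nesting += 1
        e = e.cause
        if hasattr(e, "message"):
            msg += "\n%s%s" % ("\t" * nesting, e.message)
        else:
            msg += "\n%s%s" % ("\t" * nesting, str(e))
    return msg


# No cause attached: the loop body never runs, only the top-level message is used.
no_cause = flatten_messages(ErrorSketch("Track has no corpora"))

# Assumed scenario: a nested exception whose message is None gets formatted as "None".
none_message = flatten_messages(ErrorSketch("outer", cause=ErrorSketch(None)))

print(repr(no_cause))
print(repr(none_message))
```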

danielmitterdorfer added the enhancement, help wanted, and :Usability labels Sep 6, 2018

danielmitterdorfer added this to the 1.x milestone Sep 6, 2018

danielmitterdorfer added the good first issue label Feb 19, 2020

bartier mentioned this issue Apr 17, 2020

Raise an error when attempting to bulk-index without any corpora #967

Merged

danielmitterdorfer closed this as completed in 137b4fc Apr 20, 2020

danielmitterdorfer modified the milestones: 2.x, 2.0.0 May 5, 2020
