Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sarra winnow option sum n bug #377

Closed
junhu3 opened this issue Jul 13, 2021 · 5 comments
Closed

sarra winnow option sum n bug #377

junhu3 opened this issue Jul 13, 2021 · 5 comments
Labels
bug Something isn't working likely-fixed likely fix is in the repository, success not confirmed yet. Priority 1 - ASAP Absolute Priority... really need to do this.

Comments

@junhu3
Copy link

junhu3 commented Jul 13, 2021

When using the option sum n in the messages, we only want winnow to publish the first one coming from the source if the name of the files are the same. But the problem is that if the content is different, winnow will publish all messages even the files have the same name.
Ex:
2021-06-08 19:47:04,444 [INFO] post_log notice=20210608194659.733278513 sftp://xxx@xxx/ /tmp/test3 headers={'mtime': '20210608194653.442133427', 'to_clusters': 'xxx', 'mode': '664', 'source': 'feeder', 'atime': '20210608194653.442133427', 'parts': '1,12,1,0,0', 'sum': 'n,8ad8757baa8564dc136c1e07507f4a98', 'from_cluster': 'xxx'}
...
2021-06-08 19:47:04,449 [DEBUG] sr_config set_sumalgo n
2021-06-08 19:47:04,449 [DEBUG] notice 20210608194703.711236954 sftp://xxx@xxx/ /tmp/test3
2021-06-08 19:47:04,449 [DEBUG] urlstr sftp://xxx@xxx//tmp/test3
2021-06-08 19:47:04,449 [DEBUG] Received notice 20210608194703.711236954 sftp://xxx@xxx//tmp/test3
...
2021-06-08 19:47:04,450 [DEBUG] sr_cache check basis=path
2021-06-08 19:47:04,450 [DEBUG] sum already in cache: key=n,8ad8757baa8564dc136c1e07507f4a98
2021-06-08 19:47:04,451 [DEBUG] added value=/tmp/test3*1,5,1,0,0
2021-06-08 19:47:04,451 [DEBUG] new entry, not a part: part=1,5,1,0,0
...
2021-06-08 19:47:04,451 [DEBUG] sr_winnow on_post
2021-06-08 19:47:04,451 [INFO] post_log notice=20210608194703.711236954 sftp://xxx@xxx/ /tmp/test3 headers={'mtime': '20210608194641.602636099', 'to_clusters': 'xxx', 'mode': '644', 'source': 'feeder', 'atime': '20210608194641.602636099', 'parts': '1,5,1,0,0', 'sum': 'n,8ad8757baa8564dc136c1e07507f4a98', 'from_cluster': 'xxx'}

That's the related code in sr_cache.py:
self.logger.debug("sum already in cache: key={}".format(key))
kdict = self.cache_dict[key]
present = value in kdict
kdict[value] = now

    # differ or newer, write to file
    self.fp.write("%s %f %s %s\n"%(key,now,qpath,part))
    self.count += 1

    if present:
       self.logger.debug("updated time of old entry: value={}".format(value))
       self.cache_hit = value
       return False
    else:
       self.logger.debug("added value={}".format(value))

    if part is None or part[0] not in "pi":
       self.logger.debug("new entry, not a part: part={}".format(part))
       return True

Adding the following if condition could solve the problem:

    self.logger.debug("sum already in cache: key={}".format(key))
    if self.cache_basis is 'path': return False
    kdict   = self.cache_dict[key]
@petersilva
Copy link
Contributor

Thanks jun, please:

then it can be easily analyzed.

@petersilva
Copy link
Contributor

OK, I tried it... first problem 'is' is a syntax error, replaced with ==.
When I run the flow tests, sr_poll fails... dunno why... so it isn't a clean fix, but seems to break something else.

@petersilva
Copy link
Contributor

in order for the checksum algorithm to work with partitioned files, it should not prevent publication of parts of a file when the partition size is different. However partitioned files are very rarely used currently... for the case of a file published in a single part, the whole point of 'n' as a checksum choice is to avoid considering data changes... which, when there are no partitions, will result in the blocksize being different in the 'partflg'... hmm...

options:

  • disable partitioning entirely... kind of drastic, as this problem only affects Name (-n) checksum.
  • modify the partflag to be just '1' when the entire file is written (no partitioning.)
  • something else.?

not sure..

petersilva added a commit that referenced this issue Aug 6, 2021
@petersilva
Copy link
Contributor

OK have a fix for that now... (for v2... discovered v3 is very messed up... adding that tag here to track fixing that, but v2 should be OK now.)

@petersilva petersilva added bug Something isn't working likely-fixed likely fix is in the repository, success not confirmed yet. Priority 1 - ASAP Absolute Priority... really need to do this. labels Aug 9, 2021
@petersilva
Copy link
Contributor

v3 fix is still outstanding.

petersilva added a commit that referenced this issue Aug 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working likely-fixed likely fix is in the repository, success not confirmed yet. Priority 1 - ASAP Absolute Priority... really need to do this.
Projects
None yet
Development

No branches or pull requests

2 participants