Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestions for handling escape and csi characters #1888

Closed
wants to merge 42 commits into from

Conversation

espenfl
Copy link
Contributor

@espenfl espenfl commented Aug 15, 2018

Here follows a simple suggestion on how to strip escape and csi characters based on #1887. We do not want these characters as they upset downstream functionality. Currently unsure if we should also do this for the SshTransport. List needs to be updated with future issues or already known characters we want to eject.

@espenfl
Copy link
Contributor Author

espenfl commented Aug 28, 2018

I get too many lines here for the LocalTransport. Should we fix or ignore?

@espenfl espenfl closed this Aug 28, 2018
@espenfl espenfl reopened this Aug 28, 2018
@@ -49,6 +48,10 @@ class LocalTransport(Transport):
# So I set the (default) limit to zero.
_DEFAULT_SAFE_OPEN_INTERVAL = 2.

# List of escape and csi characters we forcefully remove from stdout
# and stderr
_EC_CHARACTERS = ['\x1b[H\x1b[J']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't ANSI formatting sequences actual dual or multibyte sequences and not single characters?
So, something like:

_EC_CHARACTERS = [b'\x1b[H', ... ] # use bytes since those are ANSI, not unicode sequences

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this indeed needs to be fixed, will update this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

if input_string:
for entry in self._EC_CHARACTERS:
input_string = input_string.strip(entry)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then here:

for entry in self._EC_CHARACTERS:
    input_string = input_string.replace(entry, b'')  # remove byte sequences, not single bytes

might be worthwhile to use something like colorama to get all the ANSI formatting sequences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was unaware of that library. Seems to yield what we need (I know nothing of potential side effects at this point).

Copy link
Contributor

@dev-zero dev-zero Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it might be better to use the ansifilter library directly instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@coveralls
Copy link

coveralls commented Sep 17, 2018

Pull Request Test Coverage Report for Build 3828

  • 8 of 8 (100.0%) changed or added relevant lines in 1 file are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.04%) to 68.112%

Files with Coverage Reduction New Missed Lines %
aiida/orm/implementation/sqlalchemy/group.py 1 87.76%
aiida/orm/implementation/general/calculation/init.py 2 79.74%
Totals Coverage Status
Change from base Build 3825: 0.04%
Covered Lines: 24218
Relevant Lines: 35556

💛 - Coveralls

@coveralls
Copy link

coveralls commented Sep 27, 2018

Coverage Status

Coverage decreased (-2.1%) to 65.591% when pulling d4c49b7 on espenfl:escape_handler_suggestions into ebf1644 on aiidateam:develop.

@giovannipizzi
Copy link
Member

Hi Espen,
I have the gut feeling that this PR might introduce worse bugs.
E.g., if you call a command to dump the content of a (binary) file, and the file contains that specific sequence of bytes, you end up with a corrupted file.

Would you be able to understand which string contains those characters? E.g. changing the tests to print the output_text (if those characters are found), to understand who is generating them?

Once this is clear, we can understand what is the correct approach for fixing this. It might be a problem upstream (e.g. you have something set in your terminal to produce escape characters).

Also, it would be useful if you could paste the output of a failed test, to see where the error shows up and again help in finding a more robust solution.

Thanks!

Copy link
Member

@giovannipizzi giovannipizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment

@espenfl
Copy link
Contributor Author

espenfl commented Oct 25, 2018

See also discussion #1887. Yes and no. I think we have to take a stand on the question of do we expect (or require) the users to run a clean system, e.g. no special colors or terminal setups. Once the answer to this question is settled, we can either disregard these issues (and rely on each users that have customs setups to fix it) or we have to find a more general solution. The characters come from the terminal and appear due to different user configs. The issue is currently only with the direct scheduler, since that analyses the output from the commands in a rather brute force manner. In addition, there is problems with the clear command. See for instance #1890. Should we allow the user (or distro) to modify clean terminal behavior? My personal preference would be; yes. But that does not mean I have a good solution to this problem.

@giovannipizzi
Copy link
Member

Hi @espenfl - so, for the issue of #1890, you can check my comment there - my position is that (since also e.g. sftp would break for instance) it's up to the user not to produce output or do other strange things in non-interactive shells - I put the solution in the issue and I think we should just document it.

That, however, is only for SSH (I think). For #1887 and this PR, instead, apparently this happens for the local transport. Do you know, for the people who have this issue, if the guard I propose in #1890 would solve also this case? (If this the case, documenting it would be enough).

@giovannipizzi
Copy link
Member

I'm closing this, in favour of a different solution (#2132)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants