
fix parse_history regex when sequence exeeds 9999#33

Merged
dotslash merged 1 commit into dotslash:master from awaxa:fix-parse-history-regex on Jul 15, 2020

Conversation

@awaxa
Contributor

@awaxa awaxa commented Jun 30, 2020

Match zero or more characters of whitespace at the beginning of lines when parsing `history 1` output.

This fixed my issue where the parse_history function stopped working once my history reached 10000 lines.

@dotslash
Owner

Hi @awaxa,
I did not understand how this fixed your issue. Can you explain what happens when the history exceeds your configured limit?

@dotslash dotslash self-requested a review July 14, 2020 05:30
@dotslash
Owner

Remove the needs-info label after you clarify :)

@awaxa
Contributor Author

awaxa commented Jul 15, 2020

Note: I use `export HISTFILESIZE=` to set no limit on the size of my `.bash_history` file, and I would not have encountered this issue with a limit below 10000.

When my `.bash_history` file reached 10000 lines and my shell evaluated the `PROMPT_COMMAND`, I started seeing the parse error warning message. The `if` statement in the `log` function evaluated to `False` because the `re.match` call in the `parse_history` function had stopped matching and no longer returned its capture groups. Since the value of the `history` argument of `parse_history` comes from the output of a subshell in the `PROMPT_COMMAND`, I inspected that output and noticed that its format no longer fit the regular expression in `parse_history`.

The command `HISTTIMEFORMAT= history 1` seems to always print the line number before the content of the line from `.bash_history`, and was now printing lines like `10011 date`.

Here's a sample of `.bash_history` lines 9998-10001 as formatted by the `history` command, from `HISTTIMEFORMAT= history | sed -n 9998,10001p`:

 9998  ls
 9999  cd -
10000  cd ..
10001  ls

Notice that there is a space at the beginning of the history output up to line number 9999; from line 10000 onward, the output begins with the line number and no leading space. The expression in the `re.match` statement in `parse_history` was defined as `^\s+(\d+)\s+(.*)$`, which matches only if the output begins with one or more whitespace characters, not if it begins with a digit.
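The padding behavior in the sample can be reproduced with a short Python sketch. It assumes the line number is right-aligned in a five-character field and followed by two spaces, which is what the sample output suggests; `format_history_line` is a hypothetical helper for illustration, not part of the project or of bash.

```python
def format_history_line(number: int, command: str) -> str:
    """Mimic the layout seen in the sample above.

    Assumption: the number is right-aligned in a 5-character field,
    then two spaces, then the command text.
    """
    return f"{number:5d}  {command}"


if __name__ == "__main__":
    for number, command in [(9998, "ls"), (9999, "cd -"), (10000, "cd .."), (10001, "ls")]:
        # repr() makes the leading space (or its absence) visible.
        print(repr(format_history_line(number, command)))
```

Under that assumption, a four-digit number leaves one space of padding, while a five-digit number fills the field completely and the line starts with a digit.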

This commit changes the first `+` in this regular expression to a `*`, so that it matches zero or more whitespace characters at the beginning of a line of history output instead of one or more.
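To see the difference between the two expressions, here is a minimal, standalone sketch. The two patterns are the ones described above; the `parse` helper and the sample strings are illustrative only and are not the project's actual `parse_history` implementation.

```python
import re

# Pattern before the change: requires at least one leading whitespace character.
OLD_PATTERN = re.compile(r"^\s+(\d+)\s+(.*)$")
# Pattern after the change: leading whitespace is optional.
NEW_PATTERN = re.compile(r"^\s*(\d+)\s+(.*)$")


def parse(pattern, line):
    """Return (sequence_number, command) or None if the line does not match.

    Hypothetical helper for illustration only.
    """
    match = pattern.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    for line in [" 9999  cd -", "10000  cd .."]:
        print(repr(line), "old:", parse(OLD_PATTERN, line), "new:", parse(NEW_PATTERN, line))
```

The old pattern returns `None` for the line that starts with `10000`, while the new pattern parses both lines.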

With this fix, I am able to continue parsing history for `.recent.db` even when my `.bash_history` file is over 9999 lines long.

@dotslash dotslash merged commit acacb9e into dotslash:master Jul 15, 2020
@dotslash
Owner

Thanks for the super detailed explanation :)

I have a few other changes in mind for the next few days. I will push them to PyPI as a batch after that.

@dotslash dotslash removed their request for review July 15, 2020 08:17
dotslash added a commit that referenced this pull request Jul 16, 2020
Also add test for #33
dotslash added a commit that referenced this pull request Jul 16, 2020
Fix incorrect number of bindings supplied

- fix #31
- add test for #33