some thing wrong with regex.search #34

Open
jxxcr opened this issue Aug 31, 2023 · 8 comments

Comments

@jxxcr

jxxcr commented Aug 31, 2023

I just ran the example from the README.md, and regex.search does not work.
Versions:
python 3.11
regex 2023.8.8
click 8.1.3
numpy 1.24.3

TypeError                                 Traceback (most recent call last)
Cell In[8], line 2
      1 with open('out.out', 'r') as f:
----> 2     for match in parse_iter(str(f.read())):
      3         print(match.values)
      4     #print(f.read())

File C:\Program Files\Python311\Lib\site-packages\cp2k_output_tools\parser.py:33, in parse_iter(content, matchers, key_mangling)
     31 def parse_iter(content: str, matchers: List[Callable] = builtin_matchers, key_mangling: bool = False) -> Iterator[Dict[str, Any]]:
     32     """Yields the structured data found in the block matches"""
---> 33     for block in parse_iter_blocks(content, matchers, key_mangling):
     34         yield block.data

File C:\Program Files\Python311\Lib\site-packages\cp2k_output_tools\parser.py:22, in parse_iter_blocks(content, matchers, key_mangling)
     20 """Yield BlockMatch objects containing both structured data and metadata for each found match"""
     21 for matcher in matchers:
---> 22     match = matcher(content)
     24     if match:
     25         if key_mangling:

File C:\Program Files\Python311\Lib\site-packages\cp2k_output_tools\blocks\program_info.py:75, in match_program_info(content, start, end, as_tree_obj)
     71 def match_program_info(
     72     content: str, start: int = 0, end: int = sys.maxsize, as_tree_obj: bool = False
     73 ) -> Optional[Union[BlockMatch, ProgramInfo]]:
     74     spans = []
---> 75     match = PROGRAM_INFO_START_RE.search(content, start, end)
     77     if not match:
     78         return None

TypeError: string indices must be integers
@dev-zero
Contributor

dev-zero commented Dec 9, 2023

Sorry for the late reply. Is there a chance you could attach the file you are trying to parse? The first ~20 lines should be sufficient, I think.

@dev-zero
Contributor

In case you also parsed the example file from the README.md, I suspect something behaves differently in Python on Windows. Could you add the following line after line 74 in the C:\Program Files\Python311\Lib\site-packages\cp2k_output_tools\blocks\program_info.py file, run it again, and send me the output of that line:

    print("end:", end, type(end))

@dev-zero
Contributor

Or run this:

    import sys
    print("maxsize:", sys.maxsize, type(sys.maxsize))
    print("hello, there"[0:sys.maxsize])
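As a complement to the check above, here is a minimal sketch that also prints the width of the platform's C `long`. It rests on an assumption that this thread does not confirm: that a C extension converting the `endpos` argument to a C `long` could overflow on platforms where `long` is 32 bits wide (as on 64-bit Windows), even though `sys.maxsize` itself is the same. The helper name `fits_in_c_long` is hypothetical.

```python
import ctypes
import sys


def fits_in_c_long(value: int) -> bool:
    """Return True if `value` is representable as a signed C long on this platform."""
    bits = ctypes.sizeof(ctypes.c_long) * 8
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


print("maxsize:", sys.maxsize)
print("C long bits:", ctypes.sizeof(ctypes.c_long) * 8)
# Assumption: if this prints False, passing sys.maxsize to C code that
# expects a C long may fail; whether regex does so is not confirmed here.
print("maxsize fits in C long:", fits_in_c_long(sys.maxsize))
```

If the last line prints `False` on the affected machine, that would be consistent with a platform difference in integer conversion rather than in `sys.maxsize` itself.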

@jxxcr
Author

jxxcr commented Dec 10, 2023 via email

@dev-zero
Contributor

Never mind, this already helps. My suspicion was that sys.maxsize is for some reason different on Windows, but that does not seem to be the case.

Could you try the following as well?

    import sys
    import regex
    print(regex.compile(r"b+").search("aaaabbbbbbcccc", 0, sys.maxsize))

@jxxcr
Author

jxxcr commented Dec 10, 2023 via email

@dev-zero
Contributor

dev-zero commented Dec 10, 2023

And does it work with this?

    import sys
    import re
    print(re.compile(r"b+").search("aaaabbbbbbcccc", 0, sys.maxsize))

I am trying to figure out whether it's a general issue with the re package, or a bug in the regex one.
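Independent of which package turns out to be at fault, one possible workaround is to clamp the end position to the length of the content before searching, so that a very large value like `sys.maxsize` never reaches the pattern's `search` method. The sketch below uses the standard `re` module to stay self-contained; it assumes that compiled `regex` patterns accept the same positional `pos` and `endpos` arguments, as the snippets in this thread suggest. The helper name `search_clamped` is hypothetical and not part of cp2k_output_tools.

```python
import re
import sys


def search_clamped(pattern, content: str, start: int = 0, end: int = sys.maxsize):
    """Search `content` between `start` and `end`, never passing an end beyond len(content)."""
    end = min(end, len(content))  # clamp so huge defaults never reach the regex engine
    return pattern.search(content, start, end)


match = search_clamped(re.compile(r"b+"), "aaaabbbbbbcccc")
print(match)
```

Clamping does not change which matches are found, because positions past the end of the string cannot take part in a match anyway.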

@jxxcr
Author

jxxcr commented Dec 11, 2023

I found that the example in the README.md works normally on Linux. These tools are also not commonly used on Windows.
