Trying to use unsupported "(?-i)" in a Python regex will result in an error #562
Comments
Are you sure that you don't need to use '' for your expressions? Something like |
Thanks for your input. I gave that a shot and received errors. Just in case my job has incorrect syntax here it is. I am wondering if putting
I receive this error
examples of failed jobs
examples of successful jobs
There is now a urlwatch subreddit |
For me, this parses successfully. Are you sure you are using Python 3 and all the latest packages and dependencies? Also, this line:
Maybe try:
|
Are you sure there are no weird invisible bytes in your file? Maybe attach it to this thread, so we can have a look? Also, looking at the file in a hex editor or an editor that shows all hidden bytes could be useful? |
As soon as I'm using the "grep" line I get the same error. Maybe using shellpipe instead helps? Something like this? |
@jprokos any news on that front? |
I am still looking into it. I began by: This job failed:
These jobs are working:
|
You cannot use |
EDIT:
Thanks, I removed that from my jobs. I am still having inconsistent behavior when combining I've checked my job file and don't see anything odd. I installed a hex editor but not familiar with how to use it to spot issues in this file. The odd thing being it seems to be job dependent. The format works for other jobs - see job 22 below. No longer finding anything:
returns nothing Changing to the following works...
The
returns
|
I looked at my Python install and can't see anything wrong. I still have inconsistencies. I am on macOS, which doesn't use GNU command line tools. Would this matter? What should the format be for the grep statement: quotes, single quotes and pipes? Another example where the pipe seems to be messing up my job. This will not return the second item in the xpath:
This does return data for "pubDate" and the grep argument works as entered.
|
Try the following in an interactive Python shell:
>>> import yaml
>>> import pprint
>>> pprint.pprint(yaml.load(open('/path/to/your/urls.yaml'), Loader=yaml.SafeLoader))
Where The grep filter has nothing to do with the
return '\n'.join(line for line in data.splitlines()
                 if re.search(subfilter['re'], line) is not None)
Nowhere do the docs say that it uses the Again, the issue you have been seeing is with The other things that you were mentioning (which should be a different issue, this isn't a support forum thread) might be a limitation of Python's xpath support: https://stackoverflow.com/a/22560964/1047040 |
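A minimal sketch of the line-by-line filtering described in the comment above, assuming the pattern from the job file is handed to Python's `re.search` unchanged. The names `grep_lines` and `is_valid_pattern` are hypothetical helpers for illustration, not part of urlwatch. It also shows that Python's `re` module rejects a bare `(?-i)`, which is the error from the issue title:

```python
import re


def grep_lines(data: str, pattern: str) -> str:
    """Keep only the lines of `data` that `pattern` matches (hypothetical helper)."""
    return "\n".join(
        line for line in data.splitlines()
        if re.search(pattern, line) is not None
    )


def is_valid_pattern(pattern: str) -> bool:
    """Return True if Python's re module can compile `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


sample = "NAG broken\nSnags in the pool\nnothing here"

# Matching is case sensitive by default, so "Snags" is filtered out.
filtered = grep_lines(sample, "NAG")

# A bare "(?-i)" is not valid Python regex syntax; "(?i)" at the start is.
bare_negation_ok = is_valid_pattern("(?-i)NAG")
leading_flag_ok = is_valid_pattern("(?i)nag")
```

Because the matching happens inside Python, the platform's `grep` binary (GNU or otherwise) should not influence the result under this assumption.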
Sorry for the long delay. This is incorrect:
That is no longer in my job and I still have the same issues. I wanted to correct that misconception before I continue to troubleshoot. Thanks. |
@thp Thanks for continuing to troubleshoot this. I did exactly as you asked and ran into an error. I do have multiple "jobs" separated by
Hope that is helpful. |
That seems to work except when Do all |
The jobs should be separated with Anyway, try this instead:
>>> import yaml
>>> import pprint
>>> pprint.pprint(yaml.load_all(open('/path/to/your/urls.yaml'), Loader=yaml.SafeLoader)) |
Yes. |
I'm closing this now, as the original issue (using |
|
Ok, try this then:
>>> import yaml
>>> import pprint
>>> pprint.pprint(list(yaml.load_all(open('/path/to/your/urls.yaml'), Loader=yaml.SafeLoader))) |
|
Cool! Maybe remove the last "---" in the file, so the |
Thanks for looking over the file. Appreciate you. |
In previous versions I was able to use double quotes to grep. When using the grep case insensitive flag (?i) my jobs fail if the search terms are in single or double quotes.
When searching for two terms using the pipe character how can I specify one term as being case insensitive and the next term as case sensitive? Like this...
grep: (?i)"National Age Group Record"|(?-i)"NAG"
When I don't quote NAG or specify it as case sensitive I end up with matches I don't want i.e. Snags
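One possible way to mix case-insensitive and case-sensitive alternatives, sketched here under the assumption that the pattern ends up in Python's `re` module (3.6 or later): a scoped group `(?i:...)` limits the flag to one alternative, so the other stays case sensitive without needing `(?-i)`. The sample strings and the `matches` helper are made up for illustration. Note that quote characters inside a regex are literal, so `"NAG"` only matches text that itself contains the quotes:

```python
import re

# (?i:...) applies IGNORECASE only inside the group; "NAG" stays case sensitive.
pattern = re.compile(r"(?i:National Age Group Record)|NAG")

# Quotes are ordinary characters in a regex, so this looks for "NAG" with quotes.
quoted = re.compile(r'"NAG"')


def matches(line: str) -> bool:
    """Return True if the mixed-case pattern is found anywhere in `line`."""
    return pattern.search(line) is not None


hit_record = matches("a new national age group record was set")
hit_nag = matches("NAG time in the 100 free")
hit_snags = matches("Snags reported at the meet")
quoted_hit = quoted.search("NAG time in the 100 free") is not None
```

An all-uppercase word such as "SNAGS" would still match the plain `NAG` alternative; wrapping it as `\bNAG\b` is one way to require whole-word matches.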