regex not working - options field too short #31

Closed
m1ch opened this issue Mar 18, 2020 · 3 comments

m1ch commented Mar 18, 2020

Saving a data source of type HTML grabber is not possible with the following combination. I tested the regex without issues, but I can't save it. No error or warning is shown.

I suspect that something in the regex is causing the issue.

Datasource:
https://www.pollenwarndienst.at/prognose/3-tages-prognose.html?tx_scload_load%5Bzip%5D=8020

Regex:
/(<h3 class="polltitle">)(?<dimension>Erle)(\s.*\n){9,15}(.*<span class="invisible">\s*)(?<value>(Keine Belastung|niedrig|mittel|hoch|sehr hoch))(<\/span>)/

Result from https://pt.functions-online.com/preg_match_all.html:

  0 => 
  array (
    0 => '<h3 class="polltitle">Erle (Alnus)</h3>

                                        <div class="date">Dienstag, 17. März</div>
                                        

<div class="contaminationbar">
    
        <div class="number number_2">2</div>
        <div class="bar bar_2"></div>
    

    
        
        
        <span class="invisible"> mittel</span>',
  ),
  1 => 
  array (
    0 => '<h3 class="polltitle">',
  ),
  'dimension' => 
  array (
    0 => 'Erle',
  ),
  2 => 
  array (
    0 => 'Erle',
  ),
  3 => 
  array (
    0 => '        
',
  ),
  4 => 
  array (
    0 => '        <span class="invisible"> ',
  ),
  'value' => 
  array (
    0 => 'mittel',
  ),
  5 => 
  array (
    0 => 'mittel',
  ),
  6 => 
  array (
    0 => 'mittel',
  ),
  7 => 
  array (
    0 => '</span>',
  ),
)
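
To reproduce the behaviour outside the online tester, here is a minimal sketch. It uses Python's `re` module instead of PHP (an assumption made for convenience, so the named groups are written `(?P<name>...)`), and `build_html` produces a simplified, hypothetical fragment rather than the real page. It only illustrates that `(\s.*\n){9,15}` ties the match to the number of lines between the heading and the span.

```python
import re

# Python port of the pattern from the report; only the named-group syntax
# changes: (?<name>...) becomes (?P<name>...).
PATTERN = re.compile(
    r'(<h3 class="polltitle">)(?P<dimension>Erle)(\s.*\n){9,15}'
    r'(.*<span class="invisible">\s*)'
    r'(?P<value>(Keine Belastung|niedrig|mittel|hoch|sehr hoch))(<\/span>)'
)


def build_html(filler_lines: int) -> str:
    """Hypothetical page fragment with a configurable number of indented lines."""
    head = '<h3 class="polltitle">Erle (Alnus)</h3>\n'
    filler = ' <div class="bar bar_2"></div>\n' * filler_lines
    tail = ' <span class="invisible"> mittel</span>'
    return head + filler + tail


def extract(html: str):
    """Return (dimension, value), or None when the pattern does not match."""
    match = PATTERN.search(html)
    return (match.group("dimension"), match.group("value")) if match else None


print(extract(build_html(10)))  # line count inside the 9..15 window
print(extract(build_html(3)))   # too few lines for {9,15}
```

If the HTML that the server fetches has a different line layout than the text pasted into the tester, the same pattern can match in one place and return nothing in the other.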

Rello (Owner) commented Mar 18, 2020

I will check.
Thank you for posting a real-world example.

Rello (Owner) commented Mar 18, 2020

Hello,

Two things. First, there is an issue in DA: the database field for the data load parameters is too short (255). I need to fix this.
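
A rough way to see why this combination hits the limit is to measure the stored options. The sketch below is Python and assumes the options are serialized as JSON with the hypothetical keys `url` and `regex`; the actual storage format in DA may differ. Only the 255 limit, the URL, and the regex come from this thread.

```python
import json

URL = (
    "https://www.pollenwarndienst.at/prognose/3-tages-prognose.html"
    "?tx_scload_load%5Bzip%5D=8020"
)
# The pattern is kept as a plain string here; it is measured, not compiled.
REGEX = (
    r'/(<h3 class="polltitle">)(?<dimension>Erle)(\s.*\n){9,15}'
    r'(.*<span class="invisible">\s*)'
    r'(?<value>(Keine Belastung|niedrig|mittel|hoch|sehr hoch))(<\/span>)/'
)

FIELD_LIMIT = 255  # column length mentioned in the comment above

# Assumption: JSON encoding with these key names. Quotes and backslashes
# in the regex are escaped by the encoder, which adds further characters.
options = json.dumps({"url": URL, "regex": REGEX})
fits = len(options) <= FIELD_LIMIT

print(len(URL), len(REGEX), len(options), fits)
```

Under these assumptions the serialized options do not fit into the field, which would be consistent with the save failing without any message.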

Second, even with the fix applied locally, the debug output of the preg match is still strange, even though the regex works in the online simulator:

{\"0\":[],\"1\":[],\"dimension\":[],\"2\":[],\"3\":[],\"4\":[],\"value\":[],\"5\":[],\"6\":[],\"7\":[]}

I will see what I can find

Rello modified the milestone: 2.1.1 (Mar 18, 2020)
Rello self-assigned this (Mar 18, 2020)
Rello added the labels "bug" (Something isn't working) and "in progress" (development in progress) (Mar 18, 2020)

Rello (Owner) commented Mar 21, 2020

Hello @m1ch,
there seems to be an issue with the line breaks (possibly introduced when PHP captures the page?). When I modify the regex a little, it works:

/(<h3 class="polltitle">)(?<dimension>Erle)([\S\s]+)(<span class="invisible">\s*)(?<value>(Keine Belastung|niedrig|mittel|hoch|sehr hoch))(<\/span>)/
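
One thing to watch with `[\S\s]+`: it is greedy, so if the page lists several plants it can run on to the last matching span instead of the one that belongs to Erle. The sketch below shows this in Python (not PHP, so named groups use `(?P<name>...)`), with a shortened, hypothetical two-plant fragment. The lazy variant `[\S\s]+?` is a suggestion, not something tested in DA.

```python
import re

VALUES = r'(Keine Belastung|niedrig|mittel|hoch|sehr hoch)'
HEAD = r'(<h3 class="polltitle">)(?P<dimension>Erle)'
TAIL = r'(<span class="invisible">\s*)(?P<value>' + VALUES + r')(<\/span>)'

GREEDY = re.compile(HEAD + r'([\S\s]+)' + TAIL)  # pattern from the comment above
LAZY = re.compile(HEAD + r'([\S\s]+?)' + TAIL)   # hypothetical lazy variant

# Hypothetical fragment with two plants; the real page has more markup.
HTML = (
    '<h3 class="polltitle">Erle (Alnus)</h3>\n'
    '<span class="invisible"> mittel</span>\n'
    '<h3 class="polltitle">Birke (Betula)</h3>\n'
    '<span class="invisible"> hoch</span>\n'
)

greedy_value = GREEDY.search(HTML).group("value")
lazy_value = LAZY.search(HTML).group("value")
print(greedy_value, lazy_value)
```

On this fragment the greedy pattern returns the value of the last plant, while the lazy one stops at the first span after the Erle heading.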

BUT:
your value is text. DA expects a numeric value here.
Can you please let me know how you want to use this information in a report?

  • a table with the current values?
  • a table with the history?
  • a chart?

Working with the <div class="number number_2">2</div> element could provide a numeric value.
Let me know what you want to achieve.
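
As an illustration of that idea, a pattern could capture the digit from the number element instead of the text label. This is a sketch in Python (named groups written `(?P<name>...)`), run against a shortened, hypothetical fragment. The pattern itself is an assumption and has not been tried in DA.

```python
import re

# Hypothetical pattern: lazily skip from the Erle heading to the first
# "number" div and capture its digits as the value.
NUMBER_PATTERN = re.compile(
    r'<h3 class="polltitle">(?P<dimension>Erle)[\S\s]+?'
    r'<div class="number number_\d+">(?P<value>\d+)</div>'
)

HTML = '''<h3 class="polltitle">Erle (Alnus)</h3>
<div class="date">Dienstag, 17. März</div>
<div class="contaminationbar">
    <div class="number number_2">2</div>
    <div class="bar bar_2"></div>
    <span class="invisible"> mittel</span>
</div>'''

match = NUMBER_PATTERN.search(HTML)
dimension = match.group("dimension")
level = int(match.group("value"))  # numeric, unlike the text label
print(dimension, level)
```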

Rello changed the title from "Datasource: website grabber - Testet regex not working" to "regex not working - options field too short" (Mar 21, 2020)
Rello added this to the 2.1.2 milestone (Mar 21, 2020)
Rello added the label "testing" (development finished; in testing) and removed the labels "in progress" (development in progress) and "testing" (Mar 31, 2020)
Rello closed this as completed (Apr 6, 2020)