New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config parsing breaks with `ignore_blank_lines=True` (object parent assignments are missed) #13

Closed
CrackerJackMack opened this Issue Jan 23, 2015 · 9 comments

Comments

Projects
None yet
2 participants
@CrackerJackMack

CrackerJackMack commented Jan 23, 2015

I was struggling with an issue for a good while today why the first line of every section of my config was being truncated. Turns out it's really a fatal flaw in many of the routines. In many places there is for idx, obj in enumerate(self._list): where it is expected that idx == obj.linenum. Especially during __init__. I say this because many functions call iter_with_comments or iter_no_comments and pass in obj.linenum as the the begin_index parameter. Thus creating the relationship between list index and the config line number.

I'm submitting this as an issue instead of a pull request directly since I'm not sure of the intended use of the iter_*_comments functions (they are public api). So it is a conundrum on fixing this issue. Wasn't sure the best way to go about fixing it, but am willing to do so with some guidance on the matter.

@CrackerJackMack

This comment has been minimized.

Show comment
Hide comment
@CrackerJackMack

CrackerJackMack Jan 23, 2015

_reassign_linenums also using the enumeration method to re-assign line numbers. I guess it could be said that after the initial parsing is done that you could call _reassign_linenums but that seem bruteforce-ish. Would seem more logical to handle that in the insert/append functions?

CrackerJackMack commented Jan 23, 2015

_reassign_linenums also using the enumeration method to re-assign line numbers. I guess it could be said that after the initial parsing is done that you could call _reassign_linenums but that seem bruteforce-ish. Would seem more logical to handle that in the insert/append functions?

@CrackerJackMack

This comment has been minimized.

Show comment
Hide comment
@CrackerJackMack

CrackerJackMack Jan 23, 2015

This seems to fix it for my use case (without changing the default) but could be breaking a public API. /me shrugs

     def iter_with_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index):
+        for obj in self._list:
+            if obj.linenum >= begin_index:
                 yield obj

     def iter_no_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index) and (not obj.is_comment):
+        for obj in self._list:
+            if obj.linenum >= begin_index and not obj.is_comment:
                 yield obj

CrackerJackMack commented Jan 23, 2015

This seems to fix it for my use case (without changing the default) but could be breaking a public API. /me shrugs

     def iter_with_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index):
+        for obj in self._list:
+            if obj.linenum >= begin_index:
                 yield obj

     def iter_no_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index) and (not obj.is_comment):
+        for obj in self._list:
+            if obj.linenum >= begin_index and not obj.is_comment:
                 yield obj
@mpenning

This comment has been minimized.

Show comment
Hide comment
@mpenning

mpenning Jan 23, 2015

Owner

Hi Kevin, please add an example config which shows how status quo is broken; perhaps the best way to do this is as a python triple-quoted string. Also include how you are parsing (i.e. with the factory option)?

Owner

mpenning commented Jan 23, 2015

Hi Kevin, please add an example config which shows how status quo is broken; perhaps the best way to do this is as a python triple-quoted string. Also include how you are parsing (i.e. with the factory option)?

@mpenning mpenning self-assigned this Jan 23, 2015

@mpenning mpenning added the question label Jan 23, 2015

@CrackerJackMack

This comment has been minimized.

Show comment
Hide comment
@CrackerJackMack

CrackerJackMack Jan 23, 2015

I was using all defaults on ciscoconfparse==1.1.23 passing in the filename as config.

I'm trying to sanitize a reproducible config, this one is 4101 lines long (wc -l)

CrackerJackMack commented Jan 23, 2015

I was using all defaults on ciscoconfparse==1.1.23 passing in the filename as config.

I'm trying to sanitize a reproducible config, this one is 4101 lines long (wc -l)

@CrackerJackMack

This comment has been minimized.

Show comment
Hide comment
@CrackerJackMack

CrackerJackMack Jan 23, 2015

https://gist.github.com/CrackerJackMack/409b6b2f9e0f0caf0ced

make sure you download the raw as those spaces can be gobbled up in copy and paste

CrackerJackMack commented Jan 23, 2015

https://gist.github.com/CrackerJackMack/409b6b2f9e0f0caf0ced

make sure you download the raw as those spaces can be gobbled up in copy and paste

@mpenning mpenning added bug and removed question labels Jan 23, 2015

@mpenning mpenning changed the title from ignore_blank_lines should probably be False to Config parsing breaks with `ignore_blank_lines=True` (object parent assignments are missed) Jan 23, 2015

@mpenning

This comment has been minimized.

Show comment
Hide comment
@mpenning

mpenning Jan 23, 2015

Owner

Great bug report; thank you. This problem was introduced when I re-wrote the library in 0.9.x... then, I added the explicit ignore_blank_lines option to resolve the concern in PR #3 (but the problem already seemed to exist at that time). BTW, your gist mentioned that loopback1's ipv6 address is missed; I'm not sure if you realized it, but loopback2's ipv4 address is broken too.

Your proposed modifications to iter_with_comments() and iter_no_comments() passes unit tests, but I'd like to take some time to consider the best way to fix this. The actual problem is that child assignments to parent objects break in the config you provided; specifically the second child never gets assigned to the parent, even if the first child is a comment. The problem seems to be triggered if there are two blank lines before a parent line.

I'm including a minimal, complete example to reproduce the problem in this bug report...

import os
from ciscoconfparse import CiscoConfParse

config = """!
!
router ospf
 area 0 
!
router isis
 net 10.10.10.10.10
 address-family ipv4 unicast
  default-metric 10
 exit-address-family


!
interface loopback 1
 ip address 192.168.0.1/32
 ipv6 address fe80::30/128
 ipv6 enable
 ipv6 router isis
 !! BUG The "ipv6 address" line never has a parent assigned to it.
 !! This problem happens for every 2nd child of a parent, if the config has two balnk
 !! lines before the parent
!
interface loopback 2
 port-name brocade config example
 ip address 192.168.1.1/32
 !! BUG The "ip address" line never has a parent assigned to it.
!
!
end
"""

cfg_list = config.split(os.linesep)

broken = CiscoConfParse(cfg_list, ignore_blank_lines=True)
for obj in broken.ConfigObjs:
    print obj
Owner

mpenning commented Jan 23, 2015

Great bug report; thank you. This problem was introduced when I re-wrote the library in 0.9.x... then, I added the explicit ignore_blank_lines option to resolve the concern in PR #3 (but the problem already seemed to exist at that time). BTW, your gist mentioned that loopback1's ipv6 address is missed; I'm not sure if you realized it, but loopback2's ipv4 address is broken too.

Your proposed modifications to iter_with_comments() and iter_no_comments() passes unit tests, but I'd like to take some time to consider the best way to fix this. The actual problem is that child assignments to parent objects break in the config you provided; specifically the second child never gets assigned to the parent, even if the first child is a comment. The problem seems to be triggered if there are two blank lines before a parent line.

I'm including a minimal, complete example to reproduce the problem in this bug report...

import os
from ciscoconfparse import CiscoConfParse

config = """!
!
router ospf
 area 0 
!
router isis
 net 10.10.10.10.10
 address-family ipv4 unicast
  default-metric 10
 exit-address-family


!
interface loopback 1
 ip address 192.168.0.1/32
 ipv6 address fe80::30/128
 ipv6 enable
 ipv6 router isis
 !! BUG The "ipv6 address" line never has a parent assigned to it.
 !! This problem happens for every 2nd child of a parent, if the config has two balnk
 !! lines before the parent
!
interface loopback 2
 port-name brocade config example
 ip address 192.168.1.1/32
 !! BUG The "ip address" line never has a parent assigned to it.
!
!
end
"""

cfg_list = config.split(os.linesep)

broken = CiscoConfParse(cfg_list, ignore_blank_lines=True)
for obj in broken.ConfigObjs:
    print obj

mpenning added a commit that referenced this issue Jan 24, 2015

@mpenning

This comment has been minimized.

Show comment
Hide comment
@mpenning

mpenning Jan 24, 2015

Owner

I just pushed 1.1.24, which should fix the problem. Can you test on your side and confirm?

Owner

mpenning commented Jan 24, 2015

I just pushed 1.1.24, which should fix the problem. Can you test on your side and confirm?

@mpenning

This comment has been minimized.

Show comment
Hide comment
@mpenning

mpenning Jan 24, 2015

Owner

FYI... I also pushed 1.2.0 this morning, which includes optimizations giving me between 25% and 24,000% parsing speed improvement (depending on python version and the type of configs)... if you have Cisco ASAs, it goes into beast-mode when you parse as CiscoConfParse('/path/to/config', syntax='asa', factory=True)... reduced a 20k-line ASA config parse from two minutes, forty seconds in 1.1.24, to 0.65 second in 1.2.0.

Owner

mpenning commented Jan 24, 2015

FYI... I also pushed 1.2.0 this morning, which includes optimizations giving me between 25% and 24,000% parsing speed improvement (depending on python version and the type of configs)... if you have Cisco ASAs, it goes into beast-mode when you parse as CiscoConfParse('/path/to/config', syntax='asa', factory=True)... reduced a 20k-line ASA config parse from two minutes, forty seconds in 1.1.24, to 0.65 second in 1.2.0.

@CrackerJackMack

This comment has been minimized.

Show comment
Hide comment
@CrackerJackMack

CrackerJackMack Jan 24, 2015

I updated to 1.2.1 as you asked and removed all but the config filename. Noticed not only the huge speedups but my original issue has been fixed as well. Thanks for working on this! This project has really been a life saver.

CrackerJackMack commented Jan 24, 2015

I updated to 1.2.1 as you asked and removed all but the config filename. Noticed not only the huge speedups but my original issue has been fixed as well. Thanks for working on this! This project has really been a life saver.

@mpenning mpenning closed this Jan 24, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment