"Invalid XML encountered": control characters in XML #626

Closed
epilys opened this issue Apr 26, 2017 · 4 comments

epilys commented Apr 26, 2017

  • vdirsyncer version: 0.15.0
  • server: owncloud 8.2.2 (has sabre 2.1.7)
  • Python version: 3.6
  • Your operating system: openbsd 6.1
  • Your config file: not applicable

So I got this error:
error: The server returned something vdirsyncer doesn't understand. Error message: InvalidXMLResponse('Invalid XML encountered: not well-formed (invalid token): line 430, column 21\nDouble-check the URLs in your config.',)

Running --verbosity=DEBUG and looking at this line in the returned XML, I find this (censored) event:

<cal:calendar-data>BEGIN:VCALENDAR
debug: VERSION:2.0
debug: PRODID:ownCloud Calendar
debug: CALSCALE:GREGORIAN
debug: BEGIN:VEVENT
debug: UID:<censored>
debug: DTSTAMP:<censored>
debug: CREATED:<censored>
debug: LAST-MODIFIED:<censored>
debug: SUMMARY:<censored>
debug: DTSTART;TZID=<censored>
debug: DTEND;TZID=<censored>
debug: LOCATION:
debug: DESCRIPTION:<censored>^D<censored>
debug: CATEGORIES:
debug: END:VEVENT
debug: END:VCALENDAR
debug: </cal:calendar-data>

A stray ^D character is returned in the DESCRIPTION field (how did it get there?). Other clients (ownCloud web client, Thunderbird Lightning) don't throw an error on this.

Running sync again...:
error: The server returned something vdirsyncer doesn't understand. Error message: InvalidXMLResponse('Invalid XML encountered: not well-formed (invalid token): line 1189, column 21\nDouble-check the URLs in your config.',)

Now what?

...
1189: debug: SUMMARY:<censored>^O
...

Ugh! And not only line 1189: a lot of other events had stray ^Ds and ^Os. Turns out copying text from PDFs leaks control characters into your calendar, who knew. Removing those characters from the events allowed me to finally sync.
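
For anyone hitting the same thing, here is a rough sketch of how the offending characters could be located in a saved copy of the response or of an event, instead of chasing one parser error at a time. It assumes the text is already decoded to a Python string; `find_control_chars` is a made-up helper name, not part of vdirsyncer.

```python
import re

# Control characters that XML 1.0 does not allow in a document:
# everything below 0x20 except tab (0x09), LF (0x0a) and CR (0x0d).
_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_control_chars(text):
    """Return a list of (line, column, caret_name) tuples, 1-based.

    Hypothetical helper for illustration only.
    """
    hits = []
    # Split on "\n" only: str.splitlines() would also split on some of
    # the very control characters we are looking for (e.g. 0x0b, 0x1c).
    for lineno, line in enumerate(text.split("\n"), start=1):
        for match in _FORBIDDEN.finditer(line):
            # Caret notation: 0x04 -> ^D, 0x0f -> ^O
            caret = "^" + chr(ord(match.group()) + 64)
            hits.append((lineno, match.start() + 1, caret))
    return hits


if __name__ == "__main__":
    sample = "SUMMARY:ok\nDESCRIPTION:foo\x04bar\nSUMMARY:baz\x0f"
    for lineno, column, caret in find_control_chars(sample):
        print(f"line {lineno}, column {column}: {caret}")
```

The columns reported here are 1-based positions within each line and are not guaranteed to match the column numbers in the parser's error message.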

@untitaker
Member

Let's remove ASCII control characters before parsing XML. This change has to happen in _parse_xml.
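
A minimal sketch of that idea, not vdirsyncer's actual code: `parse_xml_lenient` is a hypothetical stand-in for the proposed change to `_parse_xml`, and it assumes the response body is bytes in UTF-8 or another ASCII-compatible encoding (in UTF-8, bytes below 0x20 only ever occur as the control characters themselves). Tab, LF and CR are kept because XML 1.0 allows them.

```python
import re
import xml.etree.ElementTree as ET

# ASCII control bytes that are not allowed in XML 1.0
# (tab, LF and CR are legal and therefore left alone).
_CONTROL_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parse_xml_lenient(body):
    """Strip forbidden control bytes, then parse.

    Hypothetical helper; assumes `body` is bytes in an
    ASCII-compatible encoding such as UTF-8.
    """
    cleaned = _CONTROL_BYTES.sub(b"", body)
    return ET.fromstring(cleaned)


if __name__ == "__main__":
    raw = b'<d:resp xmlns:d="DAV:"><d:data>SUMMARY:foo\x04bar</d:data></d:resp>'
    try:
        ET.fromstring(raw)
    except ET.ParseError as exc:
        print("strict parse failed:", exc)
    root = parse_xml_lenient(raw)
    print(root[0].text)
```

Note that this silently drops the characters from the item as seen by the client; the copy stored on the server is unchanged.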

@untitaker
Member

@epilys could you figure out where that control character came from?

untitaker added the planning label and removed the ready label on May 11, 2017
@epilys
Author

epilys commented May 11, 2017

@untitaker It was copy-pasted from a PDF.

@untitaker
Member

untitaker commented May 11, 2017 via email
