As you can probably tell with all the html-related activity, I'm trying to load a complex HTML document (with 589 tables in it) that is stretching rio's abilities.
The newest issue comes down to how you'd like to handle more complex HTML structure within a `<td>` element. In the example here, some of the header cells and some of the data cells each contain two paragraphs (`<p>`).
How would you like to handle that? A simple option could be to paste them together with a user-defined separator. A more complex option could be to allow the user to provide the function. I'm sure that there are more options than that.
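To make the two options concrete, here is a minimal sketch using xml2. It is not rio's implementation: `collapse_cell()` and its `sep` and `combine` arguments are hypothetical names invented for this illustration, and the inline table is a made-up stand-in for the attached file.

```r
library(xml2)

# Hypothetical helper (not part of rio): merge all <p> children of one cell
# into a single string, using a separator or a user-supplied function.
collapse_cell <- function(cell, sep = "\n", combine = NULL) {
  paragraphs <- xml_find_all(cell, "./p")
  # Cells without <p> children keep their plain text.
  if (length(paragraphs) == 0) {
    return(xml_text(cell, trim = TRUE))
  }
  parts <- xml_text(paragraphs, trim = TRUE)
  if (is.null(combine)) paste(parts, collapse = sep) else combine(parts)
}

# Made-up example table with two paragraphs in one header and one data cell.
html <- read_html(
  "<table>
     <tr><th><p>Name</p><p>(unit)</p></th><th>Value</th></tr>
     <tr><td><p>first</p><p>second</p></td><td>1</td></tr>
   </table>"
)

rows <- xml_find_all(html, ".//tr")

# Option 1: paste the paragraphs together with a user-defined separator.
cells <- lapply(rows, function(row) {
  vapply(xml_find_all(row, "./th|./td"), collapse_cell, character(1), sep = " / ")
})

# Option 2: let the user supply the combining function (here: keep the first).
first_only <- lapply(rows, function(row) {
  vapply(xml_find_all(row, "./th|./td"), collapse_cell, character(1),
         combine = function(parts) parts[1])
})
```

Either way, each `<td>` or `<th>` yields exactly one value, so the number of columns matches the table as written.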
Changing this would be slightly backward incompatible: if `unlist()` previously resulted in an equal number of values, those values would have been spread across columns. Putting them into a single cell would change that behavior, but it seems more consistent with the underlying table, so the new behavior seems preferable to me.
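The difference can be illustrated in base R. This is a simplified model of the behavior described above, not rio's actual code; the nested lists are made-up stand-ins for parsed rows in which one cell holds two paragraphs.

```r
# Two rows of two cells each; the second cell of each row has two paragraphs.
rows <- list(
  list("Name", list("Value", "(unit)")),
  list("x", list("1", "kg"))
)

# Flattening each row with unlist() spreads the paragraphs across columns.
flattened <- do.call(rbind, lapply(rows, unlist))

# Collapsing each cell first keeps one value per cell.
collapsed <- do.call(rbind, lapply(rows, function(row) {
  vapply(row, function(cell) paste(unlist(cell), collapse = " "), character(1))
}))

dim(flattened)  # 2 rows, 3 columns
dim(collapsed)  # 2 rows, 2 columns
```

Code that relied on the three-column result would see a different shape after the change, which is the incompatibility in question.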
@billdenney Thank you very much for reporting this. I would argue that the HTML functionalities are more for exporting than importing. There is no standard HTML table and therefore, as you reported, it is easy to break the html import function.
As I explained in #307, breaking changes should be avoided. Also, as there is no standard, any solution to this is prone to break. I think a fair way to handle this is to explain in the documentation that the HTML import functionality is not robust. For complex HTML tables, one should write one's own solution with xml2 or rvest.
Here is an example file: multi-elements-under-td.zip
The session info is still the same as in the last few issues.