Accessing List within Table #5
Hi,
On Tue, 26 Feb 2019 at 02:31, s3nn <notifications@github.com> wrote:
> Hey,
> First of all great project!
> I just have a question regarding accessing lists (bullet points) within
> tables. Basically, I have many documents that are comprised of tables that
> have identical structure (rows / columns etc) and I'm trying to extract all
> data from all tables from multiple documents in a meaningful way. The
> problem is some cells contain bullet points (lists).
> Is there any way to get all text in a table cell with one method?
> From some testing, it appears if I use get_value / get_values it doesn't
> return the text that is part of the List.
Yes, .value will try to cast to a basic Python type (typically a string
or a number).
> However, I can use get_cells --> get_lists to extract the text, but I
> would need to check for the presence of
Here it depends on the nature of your documents:
- Apparently you have some ODF elements nested in the cell, so the right
method is to analyse it step by step, writing a few lines of code that use
get_cells --> get_lists and so on. Note that get_lists should return
None if there is no list.
- To retrieve the text content at the risk of losing some structure, I would
try cell.text(), cell.text_recursive() or cell.text_content() (I think the
latter is the most powerful).
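The two options above (walking the cell step by step versus flattening its text) can be sketched without any ODF library. This is only an illustration using Python's standard xml.etree.ElementTree on a hand-written table-cell fragment, not the library's own implementation; the sample cell and the helper names `cell_structured` and `cell_flat` are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# ODF namespaces, as defined by the OpenDocument specification.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Hypothetical cell: one plain paragraph followed by a two-item bullet list.
CELL_XML = """
<table:table-cell
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <text:p>Summary</text:p>
  <text:list>
    <text:list-item><text:p>first point</text:p></text:list-item>
    <text:list-item><text:p>second point</text:p></text:list-item>
  </text:list>
</table:table-cell>
""".strip()


def paragraph_text(paragraph):
    """Text of one paragraph, including text inside nested spans."""
    return "".join(paragraph.itertext())


def cell_structured(cell):
    """Step by step: keep direct paragraphs and list items apart."""
    paragraphs = [paragraph_text(p) for p in cell.findall("text:p", NS)]
    items = [
        paragraph_text(p)
        for p in cell.findall("text:list/text:list-item/text:p", NS)
    ]
    return {"paragraphs": paragraphs, "list_items": items}


def cell_flat(cell):
    """One call: every paragraph at any depth, one per line."""
    paragraphs = cell.iter(f"{{{NS['text']}}}p")
    return "\n".join(paragraph_text(p) for p in paragraphs)


cell = ET.fromstring(CELL_XML)
structured = cell_structured(cell)
flat = cell_flat(cell)
print(structured)
print(flat)
```

The structured variant keeps the information that some text came from a list, at the cost of handling each kind of child explicitly; the flat variant needs no check for the presence of lists but loses that distinction.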
> any lists for each cell. Lastly, I could also use get_styled_elements for
> each cell, but this might get tricky.
> What would you recommend? Thank you in advance and keep up the excellent
> work.
regards,
jd
--
Jérôme Dumonteil
Hey jd,
Thanks for your recommendation, it looks like cell.text_content is what I'm looking for!
Sincerely,
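For the original goal of pulling every table out of many identically structured documents, the same flattening idea can be applied per cell across whole files. The following is a rough sketch, not the library's API: it reads content.xml straight out of the .odt zip package with the standard library, and it assumes no nested tables and no repeated-cell attributes. `tables_from_odt`, `cell_text` and `make_demo_odt` are hypothetical names, and the demo document is built in memory purely so the example runs on its own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def cell_text(cell):
    """Join every paragraph found at any depth in the cell, one per line."""
    paragraphs = cell.iter(f"{{{TEXT_NS}}}p")
    return "\n".join("".join(p.itertext()) for p in paragraphs)


def tables_from_odt(source):
    """Return each table as a list of rows, each row a list of cell strings.

    `source` is a file path or a binary file-like object.
    Assumption: tables are not nested inside other tables.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("content.xml"))
    tables = []
    for table in root.iter(f"{{{TABLE_NS}}}table"):
        rows = [
            [cell_text(cell) for cell in row.findall(f"{{{TABLE_NS}}}table-cell")]
            for row in table.iter(f"{{{TABLE_NS}}}table-row")
        ]
        tables.append(rows)
    return tables


def make_demo_odt():
    """Build a tiny in-memory package holding only a content.xml (demo only)."""
    content = f"""<office:document-content
    xmlns:office="{OFFICE_NS}" xmlns:table="{TABLE_NS}" xmlns:text="{TEXT_NS}">
  <office:body><office:text>
    <table:table table:name="T1">
      <table:table-row>
        <table:table-cell><text:p>Name</text:p></table:table-cell>
        <table:table-cell><text:p>Notes</text:p></table:table-cell>
      </table:table-row>
      <table:table-row>
        <table:table-cell><text:p>Alice</text:p></table:table-cell>
        <table:table-cell>
          <text:list>
            <text:list-item><text:p>a</text:p></text:list-item>
            <text:list-item><text:p>b</text:p></text:list-item>
          </text:list>
        </table:table-cell>
      </table:table-row>
    </table:table>
  </office:text></office:body>
</office:document-content>"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


# Two stand-in documents; real use would pass a list of .odt file paths.
documents = [make_demo_odt(), make_demo_odt()]
all_tables = [tables_from_odt(doc) for doc in documents]
print(all_tables[0])
```

Because every document shares the same table layout, the resulting nested lists line up row for row and can be compared or merged directly.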