Converting tables to “running text” #44
Comments
This can easily be done:

from inscriptis import get_text

with open("/tmp/t.html") as f:
    content = f.read()
text = get_text(content)
for line in text.split('\n'):
    # skip empty lines in the rendered output
    if not line.strip():
        continue
    # split into five columns; the last one keeps any remaining whitespace
    product, size, price, location, comment = line.strip().split(maxsplit=4)
    print(f'Product: {product} Size: {size} Price: {price} Location: {location} Comment: {comment}')
@AlbertWeichselbraun thank you. However, this was just an example. What I was looking for is a generic way to set a rule or callback on every table found by inscriptis. Is it possible?
For example, consider the following table:
The running text of the first row should be:
my recommendation would be to use
You could then automatically extract tables and split them into columns based on the tabulator.
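The splitting step could look roughly like the sketch below. It assumes the table text has already been rendered with a tab between cells (how that is configured in inscriptis is not shown here); `split_table` and the sample string are hypothetical and not part of inscriptis or this issue.

```python
def split_table(text, sep="\t"):
    """Split rendered table text into rows of stripped cell strings.

    Assumes one table row per line and `sep` between cells.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between or around rows
        rows.append([cell.strip() for cell in line.split(sep)])
    return rows


# Hypothetical sample, not taken from the issue.
sample = "Product\tSize\tPrice\nShirt\tM\t10\n\nHat\tL\t5\n"
rows = split_table(sample)
for row in rows:
    print(row)
```

Splitting on a tab rather than on arbitrary whitespace keeps cells that contain spaces intact, which the `split(maxsplit=4)` approach only guarantees for the last column.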
I have a table, for example:
That inscriptis transforms into:
Which is great. However, I want to convert it into running text:
That means for each row, add the column name before the value.
Can this be done using inscriptis?
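To make the requested rule concrete (column name before each value, for every row), here is a minimal sketch in plain Python. It is a post-processing step on rows that are already split into cells, not an inscriptis feature; `rows_to_running_text` and the sample data are hypothetical.

```python
def rows_to_running_text(header, rows):
    """Prefix each cell with its column name, one output line per row."""
    lines = []
    for row in rows:
        # zip stops at the shorter sequence, so surplus cells or
        # surplus column names are silently dropped
        pairs = (f"{name}: {value}" for name, value in zip(header, row))
        lines.append(" ".join(pairs))
    return lines


# Hypothetical sample data, not taken from the issue.
header = ["Product", "Size", "Price"]
rows = [["Shirt", "M", "10"], ["Hat", "L", "5"]]
running = rows_to_running_text(header, rows)
for line in running:
    print(line)
```

Because the column names come from the header row rather than from hard-coded variables, the same function works for any table whose first row holds the column names.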