@dependabot dependabot bot commented on behalf of github Dec 10, 2024

Bumps unstructured from 0.10.27 to 0.16.11.

Release notes

Sourced from unstructured's releases.

0.16.11

Enhancements

  • Enhance quote standardization tests with additional Unicode scenarios
  • Relax table segregation rule in chunking. Previously a Table element was always segregated into its own pre-chunk such that the Table appeared alone in a chunk or was split into multiple TableChunk elements, but never combined with Text-subtype elements. Allow table elements to be combined with other elements in the same chunk when space allows.
  • Compute chunk length based solely on element.text. Previously .metadata.text_as_html was also considered, and since it is always longer than the text (due to HTML tag overhead) it was the effective length criterion. Remove text-as-html from the length calculation such that text-length is the sole criterion for sizing a chunk.
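
The two chunking changes above can be illustrated with a toy greedy chunker. This is a minimal sketch under stated assumptions, not the library's implementation: the `Element` dataclass, the `chunk_by_text_length` function, and the one-character separator are all hypothetical names and choices made for illustration. It sizes chunks by `len(element.text)` only and lets a table share a chunk with neighbouring text when space allows.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a document element; not the library's class."""

    kind: str  # e.g. "Text" or "Table"
    text: str
    text_as_html: str = ""  # present on tables, deliberately ignored when sizing


def chunk_by_text_length(elements, max_characters):
    """Greedily pack elements into chunks, sizing by len(element.text) only."""
    chunks, current, current_len = [], [], 0
    for el in elements:
        # Assumption: one separator character between elements in a chunk.
        added = len(el.text) + (1 if current else 0)
        if current and current_len + added > max_characters:
            # The element does not fit, so close the chunk and start a new one.
            chunks.append(current)
            current, current_len = [], 0
            added = len(el.text)
        current.append(el)
        current_len += added
    if current:
        chunks.append(current)
    return chunks


elements = [
    Element("Text", "Quarterly results:"),
    Element(
        "Table",
        "Q1 10 Q2 12",
        "<table><tr><td>Q1</td><td>10</td></tr><tr><td>Q2</td><td>12</td></tr></table>",
    ),
    Element("Text", "Figures are in millions."),
]

chunks = chunk_by_text_length(elements, max_characters=60)
print([[el.kind for el in chunk] for chunk in chunks])
```

With a 60-character limit the table text is short enough to sit in the same chunk as the surrounding text, even though its HTML form alone would exceed the limit. The sketch does not split an element that is itself longer than the limit.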

Features

Fixes

  • Fix ipv4 regex to correctly include up to three digit octets.
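
For reference, a pattern that accepts octets of one to three digits might look like the following. This is a sketch, not the pattern used in unstructured: the names `OCTET`, `IPV4_RE`, and `find_ipv4` are hypothetical, and the additional 0-255 range check is an assumption that goes beyond what the release note states.

```python
import re

# One octet written with one to three digits, restricted to 0-255 (assumption).
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four octets separated by dots, bounded so longer digit runs are not matched.
IPV4_RE = re.compile(rf"\b{OCTET}(?:\.{OCTET}){{3}}\b")


def find_ipv4(text):
    """Return every IPv4-looking address found in the text."""
    return IPV4_RE.findall(text)


print(find_ipv4("host 192.168.1.100 talks to 10.0.0.1"))
```

A pattern that only allowed one or two digits per octet would miss addresses such as 192.168.1.100, which is the kind of case the fix describes.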

0.16.10

Enhancements

Features

Fixes

  • Fix original file doctype detection from cct converted file paths for metrics calculation.

0.16.9

What's Changed

Full Changelog: Unstructured-IO/unstructured@0.16.8...0.16.9

0.16.8

Enhancements

  • Metrics: Weighted table average is optional

Features

Fixes

0.16.7

Enhancements

  • Add image_alt_mode to partition_html. Adds an image_alt_mode parameter to partition_html() to control how alt text is extracted from images in HTML documents for html_parser_version=v2. The parameter can be set to to_text to extract alt text as text from <img> HTML tags.
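
To make the idea of extracting alt text as text concrete, here is a standard-library sketch. It does not call unstructured and does not reproduce how partition_html() works internally; the `AltTextCollector` class and the `html_to_text_with_alt` function are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect visible text, treating each <img> alt attribute as text."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Emit the alt text of an image in document order, if it has one.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.parts.append(alt.strip())

    def handle_data(self, data):
        # Keep non-empty text nodes.
        if data.strip():
            self.parts.append(data.strip())


def html_to_text_with_alt(html):
    """Return text fragments from the HTML, including image alt text."""
    collector = AltTextCollector()
    collector.feed(html)
    collector.close()
    return collector.parts


print(html_to_text_with_alt('<p>Revenue chart:</p><img src="c.png" alt="Bar chart of revenue">'))
```

Images without an alt attribute contribute nothing in this sketch, so the surrounding text is returned unchanged.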

Features

Fixes

0.16.6

... (truncated)

Changelog

Sourced from unstructured's changelog.

0.16.11

Fixes

  • Fix ipv4 regex to correctly include up to three digit octets.

Enhancements

  • Enhance quote standardization tests with additional Unicode scenarios
  • Relax table segregation rule in chunking. Previously a Table element was always segregated into its own pre-chunk such that the Table appeared alone in a chunk or was split into multiple TableChunk elements, but never combined with Text-subtype elements. Allow table elements to be combined with other elements in the same chunk when space allows.
  • Compute chunk length based solely on element.text. Previously .metadata.text_as_html was also considered, and since it is always longer than the text (due to HTML tag overhead) it was the effective length criterion. Remove text-as-html from the length calculation such that text-length is the sole criterion for sizing a chunk.

Features

Fixes

0.16.10

Enhancements

Features

Fixes

  • Fix original file doctype detection from cct converted file paths for metrics calculation.

0.16.9

Enhancements

Features

Fixes

  • Fix NLTK Download to not download from unstructured S3 Bucket

0.16.8

Enhancements

  • Metrics: Weighted table average is optional

Features

Fixes

0.16.7

Enhancements

  • Add image_alt_mode to partition_html. Adds an image_alt_mode parameter to partition_html() to control how alt text is extracted from images in HTML documents for html_parser_version=v2. The parameter can be set to to_text to extract alt text as text from <img> HTML tags.

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the chore label Dec 10, 2024
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Dec 10, 2024
@dependabot dependabot bot force-pushed the dependabot/pip/unstructured-0.16.11 branch from 4e43d13 to c9dee87 Compare December 18, 2024 00:51
Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.10.27 to 0.16.11.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@0.10.27...0.16.11)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/unstructured-0.16.11 branch from c9dee87 to 2ecb702 Compare December 20, 2024 04:51
dependabot bot commented on behalf of github Jan 6, 2025

Superseded by #200.

@dependabot dependabot bot closed this Jan 6, 2025
@dependabot dependabot bot deleted the dependabot/pip/unstructured-0.16.11 branch January 6, 2025 16:58