Add Aryn trained DETR model for entity detection #212
2f47706 to 5eb3f12
Is it not possible to do table extraction if we use this?
def zero(self) -> bool:
    return math.isclose(self.x1, self.x2, rel_tol=1e-6) or math.isclose(self.y1, self.y2, rel_tol=1e-6)

def iou(self, other: "BoundingBox") -> float:
Minor: This is "intersection over union", right? Maybe leave a comment so people know what to search for.
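For anyone searching later, here is a minimal sketch of intersection over union for axis-aligned boxes. The `Box` class is a hypothetical stand-in, not the PR's `BoundingBox`; it only assumes the same `x1`, `y1`, `x2`, `y2` corner fields that `zero()` uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for the PR's BoundingBox (corner coordinates)."""

    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def iou(self, other: "Box") -> float:
        """Intersection over union (IoU): overlap area divided by combined area."""
        # Corners of the overlapping rectangle, if there is one.
        ix1, iy1 = max(self.x1, other.x1), max(self.y1, other.y1)
        ix2, iy2 = min(self.x2, other.x2), min(self.y2, other.y2)
        intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = self.area() + other.area() - intersection
        # Guard against two zero-area boxes, where the ratio is undefined.
        return intersection / union if union > 0 else 0.0
```

Identical boxes score 1.0, disjoint boxes score 0.0, and partial overlaps fall in between, which is what makes IoU usable with a single threshold.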

@staticmethod
def _supplement_text(inferred: List[Element], text: List[Element], threshold: float = 0.5) -> List[Element]:
    # this is a n^2 time complexity, but for hungarian, it's too expensive? n^3
Can you give a little more context here on your choices? From Googling I see that the Hungarian algorithm is indeed O(n^3), but it's not very clear to me what algorithm we used that is O(n^2). Is it the same as Unstructured? What are the cases where this doesn't work?
Fixed. It is similar but simplified: I only supplement the inferred elements with the text elements.
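To make the O(n^2) versus O(n^3) trade-off concrete, here is one possible reading of "only supplement" as a sketch. It is an assumption, not the PR's actual `_supplement_text`: boxes are plain tuples rather than `Element` objects, and a text box is kept only when it overlaps no inferred box at or above the threshold. Each text box is compared against every inferred box once, so the cost is quadratic, and no optimal one-to-one assignment (as the Hungarian algorithm would compute) is attempted.

```python
from typing import List, Tuple

# (x1, y1, x2, y2); a hypothetical stand-in for the boxes carried by Element.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def supplement_text(inferred: List[Box], text: List[Box], threshold: float = 0.5) -> List[Box]:
    """Return the inferred boxes plus every text box that none of them already covers."""
    result = list(inferred)
    for t in text:
        # One pass over the inferred boxes per text box: O(len(text) * len(inferred)).
        if all(iou(t, i) < threshold for i in inferred):
            result.append(t)
    return result
```

A case where this kind of greedy check can fall short: a text box that partially overlaps several inferred boxes, each below the threshold, is added even though together they may cover it.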
    return inferred

def partition_pdf(self, file: BinaryIO, threshold: float = 0.4) -> List[List["Element"]]:
    with tempfile.TemporaryDirectory() as tmp_dir, tempfile.NamedTemporaryFile() as tmp_file:
Do we write to a temporary file because pdf2image/pdfminer/detr expect it? Do they only support names, or can you pass in a BinaryIO object? I know pdf2image supports binary input (sycamore/sycamore/functions/document.py, line 38 in 725adf8):
images = pdf2image.convert_from_bytes(doc.binary_representation)
I could not get it to work directly from bytes; I will try a couple more times.
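If some of the libraries turn out to accept only a file name, the temporary-file route can at least be isolated in one place. The helper below is a hypothetical sketch using only the standard library; `with_temp_path` and its `consume` callback are illustrative names, not part of the PR.

```python
import shutil
import tempfile
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def with_temp_path(file: BinaryIO, consume: Callable[[str], T], suffix: str = ".pdf") -> T:
    """Copy a binary stream to a named temporary file and hand its path to `consume`."""
    with tempfile.NamedTemporaryFile(suffix=suffix) as tmp_file:
        shutil.copyfileobj(file, tmp_file)
        # Flush so that a reader opening the path sees the complete contents.
        tmp_file.flush()
        # The file is deleted when the block exits, so `consume` must finish here.
        return consume(tmp_file.name)


def read_back(path: str) -> bytes:
    """Example consumer: read the whole file at `path`."""
    with open(path, "rb") as f:
        return f.read()
```

Any path-only call could then be passed as `consume`. One caveat: on Windows, a file created by `NamedTemporaryFile` with default settings may not be reopenable by name while it is still open.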
results = self.processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=threshold)[
    0
]
I guess since this passes linting this must be what black generated, right? What a weird format for array indexing :)
5eb3f12 to 357a4d3
357a4d3 to a386b22
No description provided.