[Feat] issue #1 exclude tags (html clean-up) #16
I added back the tags that must be removed by default
CI/CD is failing because we hit the LlamaParse rate limit.
@rafaelsideguide Now that we have an initial testing framework, we should start testing these changes and get this merged. Also, @oliviermills, one quick thing: I noticed that this PR was made before we switched to AGPL 3.0. Can you confirm that you agree to relicense your contributions under the new license? Once that's done, we can proceed with merging them! (You can just write a comment here saying "I agree to relicense my contributions to the AGPL"; see #134 for more context.) Thank you.
I agree to relicense my contributions to the AGPL 🤗
Replaced the exclude-tag list with a function that does a cleaner, safer HTML clean-up. Resolves #1.
Added basic tests for the function.
Important: we should add an integration test covering a much larger variety of HTML pages; see #15.
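
For readers who want a concrete picture of what a tag-excluding clean-up function can look like, here is a minimal sketch. It is not the code from this PR: it is written in Python with only the standard library so that it runs on its own, and the names `strip_tags`, `_TagStripper`, and `DEFAULT_EXCLUDED`, as well as the default tag set, are hypothetical. It walks the HTML with a parser instead of matching tags with regular expressions, and drops each excluded element together with everything nested inside it.

```python
from html.parser import HTMLParser

# Hypothetical defaults; the list actually used in the PR is not shown in this thread.
DEFAULT_EXCLUDED = frozenset({"script", "style", "noscript"})

# Void elements have no closing tag, so they can never open a skipped region.
VOID_TAGS = frozenset({"area", "base", "br", "col", "embed", "hr", "img",
                       "input", "link", "meta", "source", "track", "wbr"})


class _TagStripper(HTMLParser):
    """Re-emits HTML while dropping excluded elements and their contents."""

    def __init__(self, excluded):
        super().__init__(convert_charrefs=False)
        self.excluded = excluded
        self.out = []
        self.skip_tag = None  # name of the excluded element being skipped
        self.depth = 0        # nesting depth of same-named tags inside it

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.depth += 1
            return
        if tag in self.excluded:
            if tag not in VOID_TAGS:
                self.skip_tag, self.depth = tag, 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: keep it verbatim unless excluded.
        if self.skip_tag is None and tag not in self.excluded:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.depth -= 1
                if self.depth == 0:
                    self.skip_tag = None
            return
        if tag not in self.excluded:  # ignore stray closers of excluded tags
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_tag is None:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_tag is None:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_tag is None:
            self.out.append(f"&#{name};")


def strip_tags(html, excluded=DEFAULT_EXCLUDED):
    """Return html with every excluded element (and its subtree) removed.

    Simplification: comments and doctype declarations are dropped as well.
    """
    parser = _TagStripper(frozenset(t.lower() for t in excluded))
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = "<div><script>alert(1)</script><p>Hi &amp; bye<br></p><nav><ul><li>x</li></ul></nav></div>"
print(strip_tags(page, {"script", "nav"}))
# prints: <div><p>Hi &amp; bye<br></p></div>
```

Tracking only the name of the element that started the skip, rather than counting every nested tag, keeps the sketch tolerant of unclosed inner tags such as `<nav><p>x</nav>`. Real-world pages are far messier than this, which is why the broader integration test mentioned above still matters.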