Added Relation property support and Link to pages in rich text support #10

Merged
merged 3 commits into from
Jan 20, 2023

@Dylfaen Dylfaen commented Dec 28, 2022

If a relation property or a rich text contains a link to a page, the graph now picks up that relation.

To get the relation from a link in a rich text, I remove the leading / of the href to get the page UID, then I try to get the page from the API. If there is an exception, I assume the href was not a link to a Notion page and move on.
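The lookup described above could look roughly like the sketch below. It is not the code in this PR: `page_from_href` and the `fetch_page` callable are hypothetical names standing in for whatever API client call retrieves a page by id.

```python
from typing import Callable, Optional


def page_from_href(href: str, fetch_page: Callable[[str], dict]) -> Optional[dict]:
    """Return the page an internal href points to, or None if it is not a page link."""
    # Drop the leading "/" so that only the page UID remains.
    page_id = href[1:] if href.startswith("/") else href
    try:
        # fetch_page is a hypothetical stand-in for the real API call.
        return fetch_page(page_id)
    except Exception:
        # Assumption: any failure means the href was not a link to a Notion page.
        return None
```

Catching every exception keeps the parser moving, but it also hides network or permission errors, so a narrower exception type would probably be safer if the client exposes one.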

I had to implement a singleton that logs the relations already created, to avoid infinite loops.
Every parser that logs a relation needs the id of the parent that calls it, because the relation logger requires it.
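For illustration, a relation logger of this kind might be sketched as follows. The class name `RelationLogger`, the `add` method and its return value are assumptions made for the example, not the names used in this PR.

```python
class RelationLogger:
    """Process-wide record of relations that were already added to the graph."""

    _instance = None

    def __new__(cls):
        # Singleton: every call returns the same shared instance.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._seen = set()
        return cls._instance

    def add(self, parent_id: str, child_id: str) -> bool:
        """Record a relation; return False if it is already known in either direction."""
        # A frozenset ignores order, so A -> B and B -> A count as the same relation.
        key = frozenset((parent_id, child_id))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A parser would call `RelationLogger().add(parent_id, page_id)` and only recurse into the linked page when the call returns `True`, which is why each parser needs its caller's `parent_id`.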

I'm not sure the API call to retrieve the page is necessary. We can probably just call BlockParser after we add the relation to the logger. I'll try it later and push the improvement.

Another area of improvement is the spacing of the nodes that are related. The edges overlap other edges where it could be avoided.

I also took the liberty of changing the name of parser.py so that it doesn't override the parser module from stdlib as warned by Pylance.

I don't know Python, so there is certainly a lot of room for improvement. Feel free to point out anything that you would like me to fix or clarify.

I had to create a relation_logger singleton that saves the relations already created to avoid looping over and over again on the same bidirectional relations.
In BlockParser, the has_children property now defaults to True, because Pages do not provide the has_children property even when they have children in the form of links in a paragraph. So now, if has_children is not provided, the parser will try to load the children.
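As a small sketch of that default, assuming the block arrives as a plain dict (the helper name `should_load_children` is made up for the example):

```python
def should_load_children(block: dict) -> bool:
    """Decide whether the children of a block or page should be fetched."""
    # Pages may omit "has_children"; a missing key is treated as True so that
    # links inside the page's paragraphs still get visited.
    return block.get("has_children", True)
```

The cost of this default is one extra children request for every object that omits the key, even when it turns out to have no children.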

The RichTextParser can now look for text blocks that contain an href. It assumes the href is a page id and tries to retrieve the page with that id. If an error is thrown, we assume it was an HTTP link and move on.
This part could be improved by matching the href against a UUID format before we send the query.
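One possible shape for that UUID check is sketched below. It assumes page ids appear either as 32 hex characters or in the dashed 8-4-4-4-12 form, and it does not handle hrefs that carry a query string or fragment. The name `looks_like_page_id` is hypothetical.

```python
import re

# 32 hex characters, or the dashed 8-4-4-4-12 UUID layout.
_PAGE_ID = re.compile(
    r"[0-9a-f]{32}|[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def looks_like_page_id(href: str) -> bool:
    """Return True if the href, minus a leading "/", has the shape of a UUID."""
    candidate = href[1:] if href.startswith("/") else href
    return _PAGE_ID.fullmatch(candidate) is not None
```

Running this check first would skip the API round trip for ordinary web links, while the existing exception handling would still cover ids that are well formed but do not resolve to a page.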

RichTextParser now needs a parent_id so we can add a new relation to the relation_logger.

Some parsers were using sets for linked_blocks and children_ids, which triggered an error when trying to extend them. I assumed it was a mistake and changed them to lists instead.
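The difference can be seen in a short standalone example. This is only an illustration of the two container types, with made-up values, not the parser code itself: Python sets have no `extend` method, so calling it raises `AttributeError`.

```python
# A list supports extend() and keeps both order and duplicates.
linked_blocks = []
linked_blocks.extend(["a", "b"])
linked_blocks.extend(["b", "c"])

# A set has no extend(); the equivalent operation is update().
children_ids = set()
# children_ids.extend(["a"])  # would raise AttributeError
children_ids.update(["a", "b"])
children_ids.update(["b", "c"])  # the duplicate "b" is dropped
```

If deduplication was the reason for the original sets, switching the calls to `update` would be an alternative to switching the containers to lists.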
@stevedsun
Owner

Thanks @Dylfaen. Because of Notion's updates, I have to set up a test environment to run your code and enhance the current test cases, so I will merge your request afterwards.

@stevedsun stevedsun merged commit ed9d6af into stevedsun:main Jan 20, 2023