Skip to content

Add ability to remove dimension links#623

Merged
shangyian merged 8 commits intoDataJunction:mainfrom
shangyian:remove-dimension-links
Jul 12, 2023
Merged

Add ability to remove dimension links#623
shangyian merged 8 commits intoDataJunction:mainfrom
shangyian:remove-dimension-links

Conversation

@shangyian
Copy link
Copy Markdown
Collaborator

@shangyian shangyian commented Jul 12, 2023

Summary

This adds a new API endpoint that allows users to remove a dimension linked to a node's column:

DELETE /nodes/{name}/columns/{column}

It also adds the corresponding Python client call:

source.unlink_dimension("payment_type", "default.payment_type", "payment_type_id")

Test Plan

Ran locally. Added tests.

Deployment Plan

@netlify
Copy link
Copy Markdown

netlify bot commented Jul 12, 2023

Deploy Preview for thriving-cassata-78ae72 canceled.

Name Link
🔨 Latest commit 620c58f
🔍 Latest deploy log https://app.netlify.com/sites/thriving-cassata-78ae72/deploys/64aef97ff4bf8a0008a9e9bc

@shangyian shangyian marked this pull request as ready for review July 12, 2023 03:23
@samredai
Copy link
Copy Markdown
Contributor

@shangyian this looks great. One thing I started thinking about as I was reviewing is that removing a dimension link could break a cube definition. Any thoughts on the best way to handle that? Since dimension links can have far reaching effects (reveal itself to a metric, reveal other dimensions to a metric, make various metrics combinable into a single cube), there doesn't seem to be an easy way to assess the damage to all existing cubes.

@shangyian
Copy link
Copy Markdown
Collaborator Author

shangyian commented Jul 12, 2023

@samredai That's a good point. It's definitely possible to add the check, although it fits into the "which nodes are available for this dimension" question, which would require us to traverse the entire graph. But I think it's worth adding since we need to be able to keep the valid/invalid state up-to-date.

I suspect something like this will work, but I need to do more testing:

def get_nodes_by_dimension(
    session: Session,
    dimension_node: Node,
    node_types: Optional[List[NodeType]] = None,
) -> List[NodeRevision]:
    """
    Find all nodes that can be joined to a given dimension
    """
    to_process = [dimension_node]
    processed: Set[str] = set()
    final_set: Set[NodeRevision] = set()
    while to_process:
        current_node = to_process.pop()
        processed.add(current_node.name)
        final_set.add(current_node.current)

        if current_node.type == NodeType.DIMENSION:
            statement = (
                select(NodeRevision)
                .join(NodeColumns, onclause=(NodeRevision.id == NodeColumns.node_id))
                .join(Column, onclause=(NodeColumns.column_id == Column.id))
                .where(Column.dimension_id.in_([current_node.id]))  # type: ignore
            )
            node_revisions = session.exec(statement).unique().all()
            for node_rev in node_revisions:
                to_process.append(node_rev.node)

    all_nodes = list(set([child for node_rev in final_set for child in node_rev.node.children]).union(final_set))
    if node_types:
        return [node for node in all_nodes if node.type in node_types]
    return all_nodes

If it turns out to be too slow, another option is that we can have the reflection service come in and refresh the validity of each node, which would be a non-blocking process that doesn't affect users.

Copy link
Copy Markdown
Member

@agorajek agorajek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Thanks @shangyian . Two small naming questions inline.

)
return response.json()

def unlink_dimension_to_node(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think unlink_dimension_from_node makes a bit more sense, no?

Also if this is a helper function should we start adding some _ in front for good measure?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I changed the name and prefixed it with _

session: Session = Depends(get_session),
) -> JSONResponse:
"""
Add information to a node column
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment seems off.

@samredai
Copy link
Copy Markdown
Contributor

samredai commented Jul 12, 2023

If it turns out to be too slow, another option is that we can have the reflection service come in and refresh the validity of each node, which would be a non-blocking process that doesn't affect users.

I'm starting to lean more and more towards the latter option--having something else do a retrospective validation. In the future we can consider making dimension links bidirectional but for the time being I think that's too strong of a connection. Plus I think that would only even solve a portion of the performance issue.

Copy link
Copy Markdown
Contributor

@samredai samredai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@shangyian
Copy link
Copy Markdown
Collaborator Author

@samredai I think that function I have actually does work (just did some more testing), but I'll go ahead and add it in a separate PR because it pairs well with some additional endpoints that expose that logic (#321).

@shangyian shangyian merged commit a22deda into DataJunction:main Jul 12, 2023
@shangyian shangyian deleted the remove-dimension-links branch July 12, 2023 21:11
@samredai
Copy link
Copy Markdown
Contributor

@shangyian nice!!

youngman-droid pushed a commit to youngman-droid/dj that referenced this pull request Aug 26, 2023
…inks

Add ability to remove dimension links
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants