Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Path template variables not correctly parsed with dot #2437

Open
jowlo opened this issue Feb 28, 2023 · 2 comments
Open

[BUG] Path template variables not correctly parsed with dot #2437

jowlo opened this issue Feb 28, 2023 · 2 comments

Comments

@jowlo
Copy link
Contributor

jowlo commented Feb 28, 2023

Bug description

Path template variables are not parsed/populated correctly when using a dot (.) as seperator between placeholders as in /dot/<foo>.<bar>/.

When using /dot/<foo>.<bar>/ as path template, a request to /dot/abc.def should result in a call to layout(foo='abc', bar='def').
Instead there is a call to layout(foo='abc.de', part=None). (Note the missing e at the end)

Using /dash/<foo>-<bar>/ as path template, the request url is handled as expected and variables are parsed/populated correctly.

Expected behavior

Variables should be parsed as expected and correctly independent of characters between seperators.

Minimal example

import dash
from dash import Dash, html, dcc


app = Dash(__name__, use_pages=True, pages_folder="")
dash.register_page(
    "Dot",
    layout=lambda foo=None, bar=None: html.H1([html.B(foo), html.I(bar)]),
    path_template='/dot/<foo>.<bar>'
)

dash.register_page(
    "Dash",
    layout=lambda foo=None, bar=None: html.H1([html.B(foo), html.I(bar)]),
    path_template='/dash/<foo>-<bar>'
)

app.layout = html.Div(
    [
        html.H1("App"),
        html.Div([
            html.Div(dcc.Link(f"{page['name']} - {page['path']}", href=page["path"]))
            for page in dash.page_registry.values()
        ]),
        html.Hr(),
        dash.page_container,
    ]
)

if __name__ == "__main__":
    app.run_server(debug=True)

Environment

dash                2.8.1
@jowlo
Copy link
Contributor Author

jowlo commented Feb 28, 2023

I looked at the code and found that path_template is used as regex pattern in _parse_path_variables after replacing the <variable> by regex groups.

dash/dash/_pages.py

Lines 96 to 120 in d20161d

def _parse_path_variables(pathname, path_template):
"""
creates the dict of path variables passed to the layout
e.g. path_template= "/asset/<asset_id>"
if pathname provided by the browser is "/assets/a100"
returns **{"asset_id": "a100"}
"""
# parse variable definitions e.g. <var_name> from template
# and create pattern to match
wildcard_pattern = re.sub("<.*?>", "*", path_template)
var_pattern = re.sub("<.*?>", "(.*)", path_template)
# check that static sections of the pathname match the template
if not fnmatch(pathname, wildcard_pattern):
return None
# parse variable names e.g. var_name from template
var_names = re.findall("<(.*?)>", path_template)
# parse variables from path
variables = re.findall(var_pattern, pathname)
variables = variables[0] if isinstance(variables[0], tuple) else variables
return dict(zip(var_names, variables))

The documentation does not mention the fact that the path_template is used as regex. While users are probably unlikely to use templates that are translated to vulnerable patterns, using regex under the hood without escaping potentially opens up DoS vectors.

I found that the implementation was jumping more hoops than needed. The re module already provides named capture groups for matches. I pasted my adapted code using those. It replaces the <foo> syntax with a regex group (?P<foo>.*) after escaping all regex characters with re.escape.

Note the re.escape on the (dash-)user supplied path template.

def _parse_path_variables(pathname, path_template):
    """
    creates the dict of path variables passed to the layout
    e.g. path_template= "/asset/<asset_id>"
         if pathname provided by the browser is "/assets/a100"
         returns **{"asset_id": "a100"}
    """

    pattern = re.compile(
        re.sub(
            "<(?P<var>.*?)>", r"(?P<\g<var>>.*)",
            re.escape(path_template)
        )
    )
    match = pattern.match(pathname)
    return match.groupdict() if match else None

I am happy to make this into a PR if you agree in principle.

@alexcjohnson
Copy link
Collaborator

Thanks @jowlo - nice detective work, and your proposal looks good. We would welcome a PR! Try and add a test of the specific bug you encountered with dots in the path; so far the pages tests are all integration tests (mostly in tests/integration/multi_page) but this could also be covered just by unit tests (would just be a new file in tests/unit)

jowlo added a commit to jowlo/dash that referenced this issue Mar 2, 2023
jowlo added a commit to jowlo/dash that referenced this issue Mar 2, 2023
jowlo added a commit to jowlo/dash that referenced this issue Mar 2, 2023
jowlo added a commit to jowlo/dash that referenced this issue Mar 3, 2023
jowlo added a commit to jowlo/dash that referenced this issue Mar 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants