Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update URLConstructor.java #322

Merged
merged 1 commit into from
Nov 25, 2023
Merged

Conversation

udittmer
Copy link
Contributor

The method didn't work for the default URL style, causing PAGE_REQUESTED and PAGE_DELIVERED events to have a null page name.

The method didn't work for the default URL style, causing PAGE_REQUESTED and PAGE_DELIVERED events to have a null page name.
Copy link
Contributor

@juanpablo-santos juanpablo-santos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@juanpablo-santos juanpablo-santos merged commit 432361b into apache:master Nov 25, 2023
1 check passed
@juanpablo-santos
Copy link
Contributor

Added in 2.12.2-git-09, thanks!

@udittmer
Copy link
Contributor Author

udittmer commented Nov 30, 2023

This patch actually causes problems: something is off with encoding, so that umlauts are mangled during saving. This happens because of the call to getParameter. If (in WikiJSPFIlter) the method is called before the super.doFilter call, the problem occurs. After doFilter, calling it causes no problem. Rewriting the method so it looks at the query string solves the problem:

   static String parsePageFromURL( final HttpServletRequest request, final Charset encoding ) {
        String name = request.getPathInfo();
        if( name == null || name.length() <= 1 ) {
            name = request.getQueryString();
		if (StringUtils.isNotBlank(name) && name.contains("page="))
			return name.substring(name.indexOf("page=")+5);
		else
			return null;
        } else if( name.charAt(0) == '/' ) {
            return name.substring(1);
        }

        //  This is required, because by default all URLs are handled as Latin1, even if they are really UTF-8.
        // name = TextUtil.urlDecode( name, encoding );

        return name;
    }
}

But I don't know enough of what is happening here to understand whether that is the right thing to do. Apparently, calling getParameter has side effects that affect encoding.

juanpablo-santos added a commit that referenced this pull request Dec 2, 2023
the latter commits the response, causing errors later on the filter chain, f.ex. when trying to create a session b/c of the response has being previously committed - fixes error related to #322
@juanpablo-santos
Copy link
Contributor

Hi @udittmer,

I think that the problem comes from using response.getEncoding(), which causes the response to be committed, which in turn causes errors later on the filter chain, f.ex., when trying to create a session. 2.12.2-git-10 uses engine.getEncoding() instead, which is the parameter used elsewhere. This seems to fix the issue, would you mind verifying it with latest master?

thx + best regards,

@udittmer
Copy link
Contributor Author

udittmer commented Dec 3, 2023

No, that doesn't make a difference - any umlauts being saved are mangled. My encoding is UTF-8, in case that makes a difference, and useEncoding is true.

I played around a bit with encoding handling in WikiJSPFilter, but I'm confused with how useEncoding, m_wiki_encoding, m_engine.getContentEncoding() and response.getCharacterEncoding() are supposed to work together. m_wiki_encoding gets initialized, but doesn't seem to be used.

The strange thing is that umlauts are shown fine in the preview.

juanpablo-santos added a commit that referenced this pull request Dec 4, 2023
…tor#parsePageFrom URL

in order to ensure the proper encoding is set.
(related to error noted at #322)
@juanpablo-santos
Copy link
Contributor

Hi @udittmer ! would you mind trying 2.12.2-git-11 to see if the issue persists?

The next filter on the filter chain after WikiJSPFilter is WikiServletFilter, which, early on, is doing this:

// Set the character encoding
httpRequest.setCharacterEncoding( m_engine.getContentEncoding().displayName() );

a quick peek at that javadoc method yields

Overrides the name of the character encoding used in the body of this request. This method must be called prior to reading request parameters or reading input using getReader(). Otherwise, it has no effect.

So the initial fix was most probably rendering this call useless, hence the errors you were expecting. I've added that call prior to the URLConstructor call and now everything behaves as expected. I've left the setCharacterEncoding call also in WikiServletFilter, just in case a call goes through that filter without going through WikiJSPFilter first (like most probably on the preview pane).

HTH!

@udittmer
Copy link
Contributor Author

udittmer commented Dec 5, 2023

Yes! Now saving umlauts works fine. Thank you very much!

@udittmer udittmer deleted the patch-2 branch June 24, 2024 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants