Update URLConstructor.java #322

udittmer · 2023-11-24T15:33:01Z

The method didn't work for the default URL style, causing PAGE_REQUESTED and PAGE_DELIVERED events to have a null page name.

juanpablo-santos

lgtm

juanpablo-santos · 2023-11-25T14:20:11Z

Added in 2.12.2-git-09, thanks!

udittmer · 2023-11-30T08:18:22Z

This patch actually causes problems: something is off with the encoding, so umlauts are mangled when a page is saved. The cause is the call to getParameter. If the method is called (in WikiJSPFilter) before the super.doFilter call, the problem occurs; calling it after doFilter causes no problem. Rewriting the method so that it looks at the query string instead solves the problem:

    static String parsePageFromURL( final HttpServletRequest request, final Charset encoding ) {
        String name = request.getPathInfo();
        if( name == null || name.length() <= 1 ) {
            name = request.getQueryString();
            if( StringUtils.isNotBlank( name ) && name.contains( "page=" ) ) {
                return name.substring( name.indexOf( "page=" ) + 5 );
            } else {
                return null;
            }
        } else if( name.charAt( 0 ) == '/' ) {
            return name.substring( 1 );
        }

        //  This is required, because by default all URLs are handled as Latin1, even if they are really UTF-8.
        // name = TextUtil.urlDecode( name, encoding );

        return name;
    }
}

But I don't know enough about what is happening here to judge whether that is the right thing to do. Apparently, calling getParameter has side effects that affect encoding.
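
For illustration, here is a minimal sketch of the query-string approach that also copes with additional parameters and percent-encoded names. It is not the code that was merged: the class QueryStringPageName and the method pageFromQuery are hypothetical names, and the sketch assumes the page name travels in a parameter called "page", as in the snippet above. It only uses java.net.URLDecoder from the JDK, so it never calls getParameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of JSPWiki.
final class QueryStringPageName {

    /** Returns the decoded value of the "page" query parameter, or null if it is absent. */
    static String pageFromQuery( final String query, final Charset encoding ) {
        if( query == null || query.isEmpty() ) {
            return null;
        }
        // Split on '&' so that parameters following "page" are not returned as part of the name.
        for( final String pair : query.split( "&" ) ) {
            final int eq = pair.indexOf( '=' );
            // Compare the whole key, so that e.g. "mypage=..." does not match.
            if( eq > 0 && "page".equals( pair.substring( 0, eq ) ) ) {
                try {
                    // Decode with an explicit charset instead of relying on the container default.
                    return URLDecoder.decode( pair.substring( eq + 1 ), encoding.name() );
                } catch( final UnsupportedEncodingException e ) {
                    // Not expected: the name comes from an existing Charset instance.
                    throw new IllegalStateException( e );
                }
            }
        }
        return null;
    }

    public static void main( final String[] args ) {
        System.out.println( pageFromQuery( "page=Main", StandardCharsets.UTF_8 ) );
        System.out.println( pageFromQuery( "skin=raw&page=M%C3%BCnchen&version=2", StandardCharsets.UTF_8 ) );
    }
}
```

Because the raw query string is read through getQueryString rather than getParameter, the request parameters are not parsed at this point, so a later setCharacterEncoding call can still take effect.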

the latter commits the response, causing errors later in the filter chain, f.ex. when trying to create a session, because the response has already been committed - fixes error related to #322

juanpablo-santos · 2023-12-02T20:05:07Z

Hi @udittmer,

I think that the problem comes from using response.getEncoding(), which causes the response to be committed, which in turn causes errors later in the filter chain, f.ex., when trying to create a session. 2.12.2-git-10 uses engine.getEncoding() instead, which is the parameter used elsewhere. This seems to fix the issue; would you mind verifying it with the latest master?

thx + best regards,

udittmer · 2023-12-03T12:24:26Z

No, that doesn't make a difference - any umlauts being saved are mangled. My encoding is UTF-8, in case that makes a difference, and useEncoding is true.

I played around a bit with the encoding handling in WikiJSPFilter, but I'm confused about how useEncoding, m_wiki_encoding, m_engine.getContentEncoding() and response.getCharacterEncoding() are supposed to work together. m_wiki_encoding gets initialized, but doesn't seem to be used.

The strange thing is that umlauts are shown fine in the preview.

…tor#parsePageFromURL in order to ensure the proper encoding is set. (related to error noted at #322)

juanpablo-santos · 2023-12-04T21:49:29Z

Hi @udittmer! Would you mind trying 2.12.2-git-11 to see if the issue persists?

The next filter on the filter chain after WikiJSPFilter is WikiServletFilter, which, early on, is doing this:

// Set the character encoding
httpRequest.setCharacterEncoding( m_engine.getContentEncoding().displayName() );

A quick peek at that method's Javadoc yields:

Overrides the name of the character encoding used in the body of this request. This method must be called prior to reading request parameters or reading input using getReader(). Otherwise, it has no effect.

So the initial fix was most probably rendering this call useless, because getParameter had already read the request parameters before the encoding was set; hence the errors you were experiencing. I've added that call prior to the URLConstructor call and now everything behaves as expected. I've also left the setCharacterEncoding call in WikiServletFilter, just in case a request goes through that filter without going through WikiJSPFilter first (as most probably happens for the preview pane).
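
The ordering constraint from the Javadoc can be sketched without a servlet container. The FakeRequest class below is a hypothetical stand-in, not the Servlet API: it only mimics the documented contract that the encoding is fixed once parameters have been read, and it assumes a Latin-1 default, as the code comment earlier in this thread suggests.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: FakeRequest is a hypothetical stand-in, NOT javax.servlet.ServletRequest.
final class EncodingOrderDemo {

    static final class FakeRequest {
        private final byte[] rawValue;                          // bytes as sent by the browser
        private Charset encoding = StandardCharsets.ISO_8859_1; // assumed container default
        private String decoded;                                 // cached on the first read

        FakeRequest( final byte[] rawValue ) {
            this.rawValue = rawValue.clone();
        }

        /** Has no effect once the parameter has been read, mirroring the Javadoc quoted above. */
        void setCharacterEncoding( final Charset charset ) {
            if( decoded == null ) {
                encoding = charset;
            }
        }

        /** Decodes the raw bytes on first access and returns the cached value afterwards. */
        String getParameter() {
            if( decoded == null ) {
                decoded = new String( rawValue, encoding );
            }
            return decoded;
        }
    }

    /** Reads the parameter first and sets the encoding afterwards (the order of the initial fix). */
    static String readThenSet( final String text ) {
        final FakeRequest request = new FakeRequest( text.getBytes( StandardCharsets.UTF_8 ) );
        request.getParameter();                                   // early read locks in Latin-1
        request.setCharacterEncoding( StandardCharsets.UTF_8 );   // too late, ignored
        return request.getParameter();
    }

    /** Sets the encoding before the first read (the order used in the later fix). */
    static String setThenRead( final String text ) {
        final FakeRequest request = new FakeRequest( text.getBytes( StandardCharsets.UTF_8 ) );
        request.setCharacterEncoding( StandardCharsets.UTF_8 );
        return request.getParameter();
    }

    public static void main( final String[] args ) {
        System.out.println( "read, then set: " + readThenSet( "M\u00fcnchen" ) );
        System.out.println( "set, then read: " + setThenRead( "M\u00fcnchen" ) );
    }
}
```

In this sketch, reading first returns the UTF-8 bytes of the umlaut decoded as two Latin-1 characters, which is the kind of mangling reported above, while setting the encoding first returns the original text.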

HTH!

udittmer · 2023-12-05T10:26:54Z

Yes! Now saving umlauts works fine. Thank you very much!

Update URLConstructor.java

f62cb81

juanpablo-santos approved these changes Nov 25, 2023

juanpablo-santos merged commit 432361b into apache:master Nov 25, 2023
1 check passed

juanpablo-santos added a commit that referenced this pull request Dec 4, 2023

Call ServletRequest#setCharacterEncoding prior to calling URLConstruc…

abd3d8b

…tor#parsePageFromURL in order to ensure the proper encoding is set. (related to error noted at #322)

udittmer deleted the patch-2 branch June 24, 2024 08:23
