Content overview on the search page doesn't work correctly for some languages. #1213

Open
tohort opened this issue Sep 23, 2022 · 0 comments
tohort commented Sep 23, 2022

I'm reporting a bug in the search page.

I use Japanese in django-wiki, and I found that the content overview on the search page does not work correctly, especially when the content contains HTML tags.

I realized that get_content_snippet() in wiki_tags.py is optimized only for languages that put spaces between words, but Japanese does not use spaces between words.
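
To illustrate the problem, here is a minimal sketch of what happens when a snippet is limited by word count. The helper `truncate_words` is hypothetical and is not the actual get_content_snippet() code; it only assumes that the limit is applied to whitespace-separated words.

```python
def truncate_words(text, max_words=5):
    """Hypothetical word-count truncation; not django-wiki's implementation."""
    # str.split() with no arguments splits on runs of whitespace.
    words = text.split()
    return " ".join(words[:max_words])


english = "The search page shows a short overview of each matching article"
japanese = "検索ページには一致した記事の短い概要が表示されます"

# English is cut after five words.
print(truncate_words(english))

# The Japanese sentence contains no whitespace, so it counts as a single
# "word" and comes back untruncated, whatever max_words is.
print(truncate_words(japanese))
```

With text like this, a word limit never takes effect for Japanese, which is why a character-based limit seems more suitable.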

I would like to propose the following instead of the current get_content_snippet():

import re

from django.template.defaultfilters import striptags

# `register` is the template.Library() instance already defined in wiki_tags.py.


@register.filter
def search_result_format(content, keyword, max_letters=300):
    # Strip HTML tags and turn line breaks into spaces.
    content = striptags(content)
    content = re.sub(r"\r?\n", " ", content)
    pattern = re.escape(keyword)

    # Centre the snippet on the first match. If the keyword does not occur
    # in the stripped content, fall back to the beginning of the text.
    match = re.search(pattern, content, re.IGNORECASE)
    match_start = match.start() if match else 0

    start_position = max(match_start - max_letters // 2, 0)
    end_position = min(start_position + max_letters, len(content))

    result_content = content[start_position:end_position]

    if start_position > 0:
        # For languages that use spaces between words:
        # drop a partial word at the start of the snippet.
        space_position = result_content.find(" ")
        if 0 <= space_position < 15:
            result_content = result_content[space_position:]
        result_content = "...." + result_content

    if end_position < len(content):
        # For languages that use spaces between words:
        # drop a partial word at the end of the snippet.
        space_position = result_content.rfind(" ")
        if space_position >= 0 and len(result_content) - space_position < 15:
            result_content = result_content[:space_position]
        result_content = result_content + "...."

    # HTML markup: highlight every occurrence of the keyword.
    result_content = re.sub(
        r"(" + pattern + r")",
        r'<strong style="background:#ddd">\1</strong>',
        result_content,
        flags=re.IGNORECASE,
    )

    return result_content
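
The core of the proposal, a character window centred on the first match, can be tried outside Django with the sketch below. The function name `snippet_around_keyword` and the sample strings are only for illustration; the sketch leaves out tag stripping, the space-trimming step, and the highlighting.

```python
import re


def snippet_around_keyword(text, keyword, max_letters=20):
    """Return up to max_letters characters centred on the first match."""
    match = re.search(re.escape(keyword), text, re.IGNORECASE)
    # Fall back to the start of the text when the keyword is not found.
    match_start = match.start() if match else 0

    start = max(match_start - max_letters // 2, 0)
    end = min(start + max_letters, len(text))
    snippet = text[start:end]

    # Mark the sides where text was cut off.
    if start > 0:
        snippet = "...." + snippet
    if end < len(text):
        snippet = snippet + "...."
    return snippet


japanese = "ウィキの検索ページでは記事の概要が表示されますが日本語では単語の間に空白がありません"
english = "The search page shows an overview of each article next to the matching title"

ja_snippet = snippet_around_keyword(japanese, "概要", max_letters=10)
en_snippet = snippet_around_keyword(english, "zzz", max_letters=10)

print(ja_snippet)
print(en_snippet)
```

Because the window is measured in characters rather than words, the limit applies the same way whether or not the text contains spaces.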