JSONDecodeError: Expecting value: line 1 column 1 (char 0) error #31

chang2eee · 2023-03-24T12:28:27Z

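Hello. I'm a computer science student who has been using hanspell in a project.

Code that ran without errors until a few days ago now fails with the error in the title, so I'm opening this issue.

Just in case, I wrapped the call in try-except. The code now runs without raising, but the spelling is not corrected and only blank output is printed.

If anyone has a solution, I'd appreciate it.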
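
For context, here is a minimal sketch of where this message comes from. Python's json.loads raises exactly this error when the text it is given does not start with a JSON value, for example an empty body or an HTML error page. The helper name describe_decode_failure is made up for illustration and is not part of hanspell.

```python
import json


def describe_decode_failure(body):
    """Return the JSONDecodeError message for body, or None if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


# An empty body and an HTML page both fail at the very first character.
print(describe_decode_failure(""))
print(describe_decode_failure("<html>blocked</html>"))
print(describe_decode_failure('{"ok": true}'))
```

So the error says nothing about spelling itself; it only means the text handed to json.loads was not JSON.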

uiandwe · 2023-03-27T06:07:47Z

The URL of the Naver spell checker that hanspell uses has changed.
In the installed package, change the base_url value in constants.py as follows:

base_url = "https://m.search.naver.com/p/csearch/ocontent/util/SpellerProxy"

(I've opened a pull request, but I don't know when it will be merged.)

YoonseongHer · 2023-03-29T23:40:46Z

base_url = "https://m.search.naver.com/p/csearch/ocontent/util/SpellerProxy"

I changed it to this URL, but the same error still occurs.

YoonseongHer · 2023-03-30T00:13:11Z

I found a fix.

Make the following three changes in the hanspell package files.

1. In constants.py, set base_url = "https://m.search.naver.com/p/csearch/ocontent/util/SpellerProxy"
2. In spell_checker.py, change the payload from

payload = {
    '_callback': 'window.__jindo2_callback._spellingCheck_0',
    'q': text
}

to

payload = {
    '_callback': 'jQuery11240003383472025177525_1680133565087',
    'q': text,
    'where': 'nexearch',
    'color_blindness': 0
}

3. Around line 61 of spell_checker.py, change r = r.text[42:-2] to r = r.text[44:-2]

Command to find where the hanspell files are installed:
pip show hanspell
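
As an alternative to reading the pip output, the file path can also be looked up from Python. This is a minimal sketch using only the standard library; find_module_file is a hypothetical helper name, and it assumes the module is importable as hanspell.constants in the current environment.

```python
import importlib.util


def find_module_file(module_name):
    """Return the file path of an importable module, or None if it is missing."""
    try:
        # For a dotted name, find_spec imports the parent package first and
        # raises ModuleNotFoundError when that parent does not exist.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        return None
    return spec.origin if spec is not None else None


# Prints the path of constants.py if hanspell is installed, otherwise None.
print(find_module_file("hanspell.constants"))
```

Running it with the same interpreter (or notebook kernel) that runs the failing code shows which copy of the file is actually being imported.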

jso4342 · 2023-03-31T04:31:20Z

@YoonseongHer's method fixed it for me! Thank you.

uiandwe · 2023-04-03T00:07:37Z

Around line 61 of spell_checker.py,
change r = r.text[42:-2] to r = r.text[44:-2]

(The end index must be -2.)
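
A note on why the offset moved from 42 to 44: each number equals the length of the corresponding _callback string plus one for the opening parenthesis, so the slice appears to be stripping a JSONP wrapper of the form callback(...); from the response. Assuming that is the response shape, the wrapper can be removed without a hard-coded offset. The sketch below is not hanspell code; strip_jsonp and the sample body are made up for illustration.

```python
import json


def strip_jsonp(text):
    """Return the text between the first '(' and the last ')' of a JSONP body."""
    start = text.find('(')
    end = text.rfind(')')
    if start == -1 or end < start:
        raise ValueError('response does not look like JSONP: {!r}'.format(text[:80]))
    return text[start + 1:end]


OLD_CALLBACK = 'window.__jindo2_callback._spellingCheck_0'
NEW_CALLBACK = 'jQuery11240003383472025177525_1680133565087'

# Hypothetical response body, shaped like callback({...});
sample = NEW_CALLBACK + '({"message": {"result": {"errata_count": 0}}});'
data = json.loads(strip_jsonp(sample))
print(len(OLD_CALLBACK) + 1, len(NEW_CALLBACK) + 1)
print(data['message']['result']['errata_count'])
```

With this approach a later change to the callback name would not require touching the slice again, and a non-JSONP response fails with a message that shows what was actually received.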

jian1114 · 2023-04-06T10:26:54Z

I applied all three changes above, but I still get the error.

HardenKim · 2023-04-06T14:00:19Z

Based on the code @YoonseongHer shared, replace the contents of spell_checker.py and constants.py with the following.

constants.py

base_url = 'https://m.search.naver.com/p/csearch/ocontent/util/SpellerProxy'


class CheckResult:
    PASSED = 0
    WRONG_SPELLING = 1
    WRONG_SPACING = 2
    AMBIGUOUS = 3
    STATISTICAL_CORRECTION = 4

spell_checker.py

# -*- coding: utf-8 -*-
"""
Korean spell-checking module for Python
"""

import requests
import json
import time
import sys
from collections import OrderedDict
import xml.etree.ElementTree as ET

from . import __version__
from .response import Checked
from .constants import base_url
from .constants import CheckResult

_agent = requests.Session()
PY3 = sys.version_info[0] == 3


def _remove_tags(text):
    text = u'<content>{}</content>'.format(text).replace('<br>','')
    if not PY3:
        text = text.encode('utf-8')

    result = ''.join(ET.fromstring(text).itertext())

    return result


def check(text):
    """
    Check the spelling of the Korean text passed as the argument.
    """
    if isinstance(text, list):
        result = []
        for item in text:
            checked = check(item)
            result.append(checked)
        return result

    # Up to 500 characters are supported.
    if len(text) > 500:
        return Checked(result=False)

    payload = {
        '_callback': 'jQuery11240003383472025177525_1680133565087',
        'q': text,
        'where': 'nexearch',
        'color_blindness': 0
    }

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        'referer': 'https://search.naver.com/',
    }

    start_time = time.time()
    r = _agent.get(base_url, params=payload, headers=headers)
    passed_time = time.time() - start_time

    r = r.text[44:-2]

    data = json.loads(r)
    html = data['message']['result']['html']
    result = {
        'result': True,
        'original': text,
        'checked': _remove_tags(html),
        'errors': data['message']['result']['errata_count'],
        'time': passed_time,
        'words': OrderedDict(),
    }

    # Rewrite the tags into a simpler form first so words can be split on whitespace.
    # ElementTree's iter() would allow a cleaner approach, but for code this short
    # it isn't worth it, so plain string replacement is used for now.
    html = html.replace('<span class=\'green_text\'>', '<green>') \
               .replace('<span class=\'red_text\'>', '<red>') \
               .replace('<span class=\'purple_text\'>', '<purple>') \
               .replace('<span class=\'blue_text\'>', '<blue>') \
               .replace('</span>', '<end>')
    items = html.split(' ')
    words = []
    tmp = ''
    for word in items:
        if tmp == '' and word[:1] == '<':
            pos = word.find('>') + 1
            tmp = word[:pos]
        elif tmp != '':
            word = u'{}{}'.format(tmp, word)
        
        if word[-5:] == '<end>':
            word = word.replace('<end>', '')
            tmp = ''

        words.append(word)

    for word in words:
        check_result = CheckResult.PASSED
        if word[:5] == '<red>':
            check_result = CheckResult.WRONG_SPELLING
            word = word.replace('<red>', '')
        elif word[:7] == '<green>':
            check_result = CheckResult.WRONG_SPACING
            word = word.replace('<green>', '')
        elif word[:8] == '<purple>':
            check_result = CheckResult.AMBIGUOUS
            word = word.replace('<purple>', '')
        elif word[:6] == '<blue>':
            check_result = CheckResult.STATISTICAL_CORRECTION
            word = word.replace('<blue>', '')
        result['words'][word] = check_result

    result = Checked(**result)

    return result
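
The comment in the code above mentions that ElementTree could replace the string substitution. The following is only a sketch of that idea, not what hanspell does: it walks the direct children of the wrapped HTML and returns (text, result code) pairs per span instead of per whitespace-separated word. The class names and the numeric codes mirror the ones in the code and the CheckResult class above; classify_spans and the sample input are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Mirrors the CheckResult values in constants.py above.
CLASS_TO_RESULT = {
    'red_text': 1,     # WRONG_SPELLING
    'green_text': 2,   # WRONG_SPACING
    'purple_text': 3,  # AMBIGUOUS
    'blue_text': 4,    # STATISTICAL_CORRECTION
}


def classify_spans(html):
    """Split result HTML into (text, result_code) pairs; 0 means PASSED."""
    wrapped = u'<content>{}</content>'.format(html.replace('<br>', ''))
    root = ET.fromstring(wrapped)
    spans = []
    if root.text:
        spans.append((root.text, 0))
    for element in root:
        text = ''.join(element.itertext())
        spans.append((text, CLASS_TO_RESULT.get(element.get('class'), 0)))
        if element.tail:
            spans.append((element.tail, 0))
    return spans


print(classify_spans("I <span class='red_text'>am</span> here"))
```

Because the pairs are per span rather than per word, a corrected phrase containing spaces stays in one piece; turning this into the words mapping used by Checked would still need a splitting step.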

Nyukist · 2023-04-07T01:38:42Z

@jian1114

After making the changes, did you run python setup.py install again from inside the py-hanspell directory?

jungin500 · 2023-04-07T05:12:19Z

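Could you try testing with my modified version?
My implementation does not do any string splitting; it uses the JSON response as-is.
It works without problems for me so far, but it needs more testing, so I'd appreciate your help.

Install the modified version (this removes the existing installed version): pip install git+https://github.com/jungin500/py-hanspell
PR: Fix 403 error and reflect changed HTML tags #34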

jiin124 · 2023-04-20T07:46:58Z

@jungin500 I tried it in a Jupyter environment in VS Code, and it doesn't run.

jiin124 · 2023-04-21T02:13:17Z

@jungin500 Oh, it works after I restarted it!!! Thank you😁🥰

jian1114 · 2023-04-22T05:58:46Z

@Nyukist Sorry, I saw this too late. It works after a restart! Thank you!!

miiiiiion0 · 2023-09-12T07:58:47Z

@jungin500 I installed the version you shared and had been using it without problems, but as of today the same error occurs.
Could you take a look?

jungin500 · 2023-09-12T08:05:08Z

@miiiiiion0 I can't reproduce the symptom in my environment right now. Please share specifics such as the error log or an example.

miiiiiion0 · 2023-09-12T08:07:08Z

@jungin500 Same as above: JSONDecodeError: Expecting value: line 1 column 1 (char 0)
That is the error I get. I'm calling it through apply on a DataFrame.
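
When the checker is applied row by row, one failed request aborts the whole apply, and a bare try-except around it can silently produce blanks. Below is a minimal sketch of a wrapper that keeps the original text when a call fails. It is not part of hanspell: safe_check is a made-up name, check_fn stands in for whatever check function is installed, and the .checked attribute is assumed from the 'checked' key passed to Checked in the code above.

```python
import json


def safe_check(text, check_fn):
    """Return the corrected text, or the original text if the check fails."""
    try:
        return check_fn(text).checked
    except (json.JSONDecodeError, KeyError) as exc:
        # Keep the input instead of a blank so failures stay visible in the data.
        print('spell check failed for {!r}: {}'.format(text[:30], exc))
        return text
```

With a DataFrame this could be used as df['text'].apply(lambda t: safe_check(t, spell_checker.check)), where the column name is hypothetical. Comparing input and output afterwards shows which rows were left uncorrected.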

jungin500 · 2023-09-12T08:31:54Z

@miiiiiion0 I'd suggest reinstalling the hanspell package. Especially if it worked before and stopped at some point, the py-hanspell version may have been downgraded because of another library's dependency.

First, remove the package with pip uninstall -y py-hanspell hanspell.
Then reinstall it with pip install git+https://github.com/ssut/py-hanspell.

Please check the result after reinstalling.
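
To see whether a downgrade actually happened, the installed version can be checked before and after reinstalling. This is a minimal sketch using the standard library (Python 3.8 or later); installed_version is a hypothetical helper, and the two distribution names are simply the ones from the uninstall command above, so which of them is registered in a given environment is an assumption.

```python
from importlib import metadata


def installed_version(candidates):
    """Return (name, version) for the first installed distribution, else None."""
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


# Prints None when neither distribution is installed.
print(installed_version(['py-hanspell', 'hanspell']))
```

In a notebook, the kernel has to be restarted after reinstalling; otherwise the previously imported module stays in memory.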

miiiiiion0 · 2023-09-13T00:59:27Z

@jungin500 I followed your steps, but the same error still occurs. I'm working in Colab.
I had installed the version you posted earlier with ! pip install git+https://github.com/jungin500/py-hanspell,
and when I ran ! pip uninstall -y py-hanspell hanspell, I got the message WARNING: Skipping hanspell as it is not installed.

After that I also reinstalled with pip install git+https://github.com/ssut/py-hanspell.

Have there been any changes to the install you shared earlier (! pip install git+https://github.com/jungin500/py-hanspell)?

