TypeError: the JSON object must be str, not 'bytes' #51

paladini · 2017-05-21T18:56:26Z

I hit this error when using the comment scraper for public pages. I've filled in all the variables correctly (app_id, app_secret and page id), and I ran the post scraper beforehand, which finished successfully.

The full error log follows:

$ python3 get_fb_comments_from_fb.py
Scraping <OMMITED> Comments From Posts: 2017-05-21 15:51:37.768667

Traceback (most recent call last):
  File "get_fb_comments_from_fb.py", line 220, in <module>
    scrapeFacebookPageFeedComments(file_id, access_token)
  File "get_fb_comments_from_fb.py", line 147, in scrapeFacebookPageFeedComments
    comments = json.loads(request_until_succeed(url))
  File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

The page I'm scraping has posts and comments written in Brazilian Portuguese (PT-BR).

paladini · 2017-05-21T19:00:00Z

If anyone is having the same issue, I've found how to fix that! Just change the following code from the comments scraper:

def request_until_succeed(url):
    req = Request(url)
    success = False
    while success is False:
        try:
            response = urlopen(req)
            if response.getcode() == 200:
                success = True
        except Exception as e:
            print(e)
            time.sleep(5)

            print("Error for URL {}: {}".format(url, datetime.datetime.now()))
            print("Retrying.")

    return response.read()

To this one (I've added .decode('utf-8') before returning the value):

def request_until_succeed(url):
    req = Request(url)
    success = False
    while success is False:
        try:
            response = urlopen(req)
            if response.getcode() == 200:
                success = True
        except Exception as e:
            print(e)
            time.sleep(5)

            print("Error for URL {}: {}".format(url, datetime.datetime.now()))
            print("Retrying.")

    return response.read().decode('utf-8')

Now it's working fine here, but I don't know whether it's reliable for everyone, so I'm not going to submit a pull request with this fix.
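
For anyone who wants to see the failure in isolation, here is a minimal sketch. The payload is a made-up stand-in for what response.read() returns, not real Graph API output. On Python 3.5, json.loads only accepts str, which is why the traceback above appears; from Python 3.6 onward json.loads also accepts bytes, so the error may not reproduce on newer interpreters.

```python
import json

# Hypothetical payload standing in for the bytes returned by response.read()
raw = '{"message": "Olá, tudo bem?"}'.encode('utf-8')

# On Python 3.5, json.loads(raw) raises the TypeError from the traceback.
# Decoding first hands json.loads the str it expects on that version.
comments = json.loads(raw.decode('utf-8'))
print(comments['message'])
```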

minimaxir · 2017-05-21T19:01:52Z

The script does encoding/decoding shenanigans in order to be compatible with both Python 2 and 3. I will have to check if that solution will work for Python 2.
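
One possible way to sidestep the version difference, as a sketch only and not what the script currently does: decode only when the payload is bytes. The helper name to_text is hypothetical. On Python 2, bytes is an alias of str, so the same branch runs there and yields unicode, which json.loads accepts; whether the rest of the script's encoding logic is happy with that would still need checking.

```python
import json

def to_text(payload, encoding='utf-8'):
    """Decode a bytes payload to text; return anything else unchanged."""
    if isinstance(payload, bytes):
        return payload.decode(encoding)
    return payload

# Works whether the caller passes raw bytes or already-decoded text
from_bytes = json.loads(to_text(b'{"id": "123"}'))
from_text = json.loads(to_text(u'{"id": "123"}'))
print(from_bytes == from_text)
```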

paladini · 2017-05-21T19:02:29Z

Thanks for the fast reply, @minimaxir !

Mika15 · 2017-05-31T18:13:45Z

Guys, again I have an issue with paging. Cannot figure out why it is happening. Can you help me? Thanks!
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
    176
    177 if __name__ == '__main__':
--> 178     scrapeFacebookPageFeedStatus(group_id, access_token)

in scrapeFacebookPageFeedStatus(group_id, access_token)
    160 if 'paging' in statuses:
    161     next_url = statuses['paging']['next']
--> 162     until = re.search('until=([0-9]*?)(&|$)', next_url).group(1)
    163     if until is None:
    164         return None

AttributeError: 'NoneType' object has no attribute 'group'
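
The traceback shows .group(1) being called directly on the result of re.search, which returns None when the pattern does not match, so the "until is None" check on the following line is never reached. A minimal sketch of checking the match first, assuming the same pattern; the function name extract_until and the example URLs are made up for illustration:

```python
import re

def extract_until(next_url):
    """Return the 'until' value from a paging URL, or None if it is absent."""
    match = re.search(r'until=([0-9]*?)(&|$)', next_url)
    if match is None:
        # No 'until' parameter in this URL: signal that instead of crashing
        return None
    return match.group(1)

# Hypothetical paging URLs, one with and one without an 'until' parameter
with_until = extract_until('https://example.com/feed?limit=100&until=1495390000&x=1')
without_until = extract_until('https://example.com/feed?limit=100&after=abc')
print(with_until, without_until)
```

This only avoids the crash. Why next_url lacks an until parameter in the first place is a separate question that the traceback does not answer.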

nxy · 2017-09-24T18:23:38Z

@paladini Thanks, that worked for me.

minimaxir mentioned this issue May 23, 2017

Broken character with some language(vi) #52

Open

leftyveggie mentioned this issue Nov 21, 2017

Type error runing page-post-scraper #86

Open
