Backspace (\x08) appears in the raw data #15

Closed
eight04 opened this issue Jun 18, 2018 · 2 comments

eight04 commented Jun 18, 2018

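When PTT outputs a special character such as "♥", it prints:

  1. Two spaces
  2. Two backspaces
  3. The special character itself
  4. A cursor-movement code

The purpose is to erase whatever text was previously on screen and to keep the output width a whole number of columns (the rendered width of this special character is somewhere between 1 and 2 columns).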
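To make the sequence above concrete, here is a minimal sketch that rebuilds it by hand and shows what happens when the backspace bytes are simply deleted. It rests on assumptions: UTF-8 is used only for illustration (the real connection may use a different encoding), the surrounding text is taken from the test post, and the trailing cursor-movement code is left out because its exact bytes are not stated here.

```python
# Hypothetical reconstruction of the byte stream described above.
# Assumptions: UTF-8 encoding for illustration only; the trailing
# cursor-movement code is omitted because its exact bytes are unknown.
BACKSPACE = b"\x08"
heart = "♥".encode("utf-8")

raw = (
    "前半部".encode("utf-8")
    + b"  "            # 1. two spaces
    + BACKSPACE * 2    # 2. two backspaces
    + heart            # 3. the special character
    + "後半部".encode("utf-8")
)

# Naively deleting the backspace bytes leaves the two padding spaces behind.
naive = raw.replace(BACKSPACE, b"").decode("utf-8")
print(repr(naive))
```

On a real terminal the two backspaces move the cursor back over the two spaces, so the character overwrites them. Deleting the backspace bytes without acting on them loses that effect.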

@Truth1987

getRawData is simply meant to give you the raw data, though @@

eight04 commented Jun 19, 2018

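I suggest handling backspaces while the data is still bytes (that is, actually performing the backspace: each backspace deletes one preceding character). Also, simply stripping the backspace characters is not quite correct either. Take this post, for example: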
#1RA8YbnY (Test) [ptt.cc] [測試]
https://www.ptt.cc/bbs/Test/M.1529383077.A.C62.html

Fetching the post content with the following program:

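from pprint import pprint
from PTTLibrary import PTT

# Placeholders: replace with your own PTT credentials.
USER = "your_id"
PASSWORD = "your_password"

bot = PTT.Library(kickOtherLogin=False)
bot.login(USER, PASSWORD)
# getPost returns an error value together with the post object.
err, post = bot.getPost("test", "1RA8YbnY")
pprint((
    post.getBoard(),
    post.getID(),
    post.getAuthor(),
    post.getContent()
))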
[06-19 12:38:08][資訊] 偵測到前景執行使用編碼: utf-8
[06-19 12:38:08][資訊] 使用者帳號:
[06-19 12:38:08][資訊] 密碼:
[06-19 12:38:08][資訊] 產生 SSH 金鑰完成
[06-19 12:38:08][資訊] 連線頻道 0 啟動
[06-19 12:38:24][資訊] 頻道 0 建立互動通道成功
[06-19 12:38:24][資訊] 頻道 0 輸入帳號
[06-19 12:38:24][資訊] 頻道 0 輸入密碼
[06-19 12:38:24][資訊] 頻道 0 讀取 PTT 畫面..
[06-19 12:38:24][資訊] 不刪除重複登入的連線
[06-19 12:38:29][資訊] 任意鍵繼續
[06-19 12:38:29][資訊] 頻道 0 登入成功
('test',
 '1RA8YbnY',
 'eight0 (人類)',
 '前半部  ♥後半部\n\n--\nヾ(;  ω;) ヾ(;  ω;)\n\nhttp://i.imgur.com/oAd97.png\n\n--')

You can see that there are two extra spaces before "♥".
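A minimal sketch of the byte-level handling suggested in this comment, where each backspace removes the byte before it instead of just being dropped. `apply_backspaces` is a hypothetical helper, not part of PTTLibrary, and the sample bytes use UTF-8 purely for illustration. It assumes that backspaces only ever follow single-byte characters (as with the two padding spaces here); erasing into a multi-byte character would need extra care.

```python
def apply_backspaces(data: bytes) -> bytes:
    """Hypothetical helper: treat each 0x08 byte as 'delete the previous byte'."""
    out = bytearray()
    for byte in data:
        if byte == 0x08:
            # Drop the previous byte, if any; a leading backspace is ignored.
            if out:
                out.pop()
        else:
            out.append(byte)
    return bytes(out)


# Illustrative input only: two spaces, two backspaces, then the character.
sample = (
    "前半部".encode("utf-8")
    + b"  \x08\x08"
    + "♥".encode("utf-8")
    + "後半部".encode("utf-8")
)

cleaned = apply_backspaces(sample).decode("utf-8")
print(repr(cleaned))
```

Doing this before decoding means the padding spaces are gone by the time the text reaches the caller, so nothing downstream has to know about the padding trick.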
