Elwing-Chou/ml0602

Deep Learning

Word2Vec

Datasets

PTT small dataset

PTT large dataset

Punctuation removal

punct = set(u''':!),.:;?]}¢'"、。〉》」』】〕〗〞︰︱︳﹐、﹒﹔﹕﹖﹗﹚﹜﹞!),.:;?|}︴︶︸︺︼︾﹀﹂﹄﹏、~¢々‖•·ˇˉ―--′’”([{£¥'"‵〈《「『【〔〖([{£¥〝︵︷︹︻︽︿﹁﹃﹙﹛﹝({“‘-—_…~/ -*➜■─★☆=@<>◉é''')
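import jieba  # third-party Chinese tokenizer
# Tokenize content and keep only the non-punctuation tokens
words = [w for w in jieba.cut(content) if w not in punct]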
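
As a rough illustration of the filtering step above, here is a minimal sketch that runs without jieba installed. The helper name `strip_punct`, the shortened `PUNCT` set, and the hand-written token list are assumptions made for this example; in the README the tokens come from `jieba.cut(content)` and the full `punct` set is used.

```python
# Hypothetical, shortened stand-in for the README's much larger punct set.
# set() over a string yields one entry per character, including the space.
PUNCT = set(",.!?:;,。、!?:;()[]{}「」『』《》〈〉…~ ")


def strip_punct(tokens, punct=PUNCT):
    """Return the tokens that are not in the punctuation set, in order."""
    return [t for t in tokens if t not in punct]


# Hand-written tokens standing in for the output of jieba.cut(content).
sample_tokens = ["今天", "天氣", "很", "好", ",", "出門", "走走", "。"]
clean_tokens = strip_punct(sample_tokens)
print(clean_tokens)
```

Because the check is by exact token match, a token is dropped only when the tokenizer emits the punctuation mark as its own token. Punctuation glued to a word would pass through unchanged.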

URL regex

import re
content = re.sub(r'https?://.*[\r\n]*', '', content)  # drops each URL and the rest of its line
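
To make the behaviour of this pattern concrete, the sketch below applies it to a made-up string. The sample text and the `URL_ONLY` variant are assumptions for illustration and are not part of the README. The variant is shown only for contrast, in case text after a URL on the same line should be kept.

```python
import re

# README pattern (without the unneeded slash escapes): matches from the URL
# to the end of its line, plus any line breaks that follow.
URL_TO_EOL = re.compile(r'https?://.*[\r\n]*')

# Assumed alternative, not from the README: matches only the run of
# non-whitespace characters that starts with the URL scheme.
URL_ONLY = re.compile(r'https?://\S+')

sample = "see https://example.com/a?b=1 for details\nnext line"

to_eol = URL_TO_EOL.sub('', sample)
url_only = URL_ONLY.sub('', sample)

print(repr(to_eol))
print(repr(url_only))
```

Note that `\S+` stops only at whitespace, so in Chinese text written without a space after the link the variant would also consume the characters that follow the URL.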

Recommended articles

w2v

Face(GPU)

FastText
