Commit edbfe0e

update
1 parent 1c29f04 commit edbfe0e

1 file changed: +23 -0 lines changed

nlp_class3/convert_twitter.py

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# https://deeplearningcourses.com/c/deep-learning-advanced-nlp
+from __future__ import print_function, division
+from builtins import range, input
+# Note: you may need to update your version of future
+# sudo pip install -U future
+
+
+# each output line should be:
+# INPUT<tab>RESPONSE
+with open('../large_files/twitter_tab_format.txt', 'w') as f:
+    prev_line = None
+    # data source: https://github.com/Phylliida/Dialogue-Datasets
+    for line in open('../large_files/TwitterLowerAsciiCorpus.txt'):
+        line = line.rstrip()
+
+        if prev_line and line:
+            f.write("%s\t%s\n" % (prev_line, line))
+
+        # note:
+        # between conversations there are empty lines
+        # which evaluate to false
+
+        prev_line = line
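
The pairing logic in this commit can be tried without the corpus files. The sketch below is a minimal in-memory version, assuming the same rule as the script: each non-empty line is paired with the non-empty line directly before it, and an empty line ends a conversation. The names `pair_lines`, `to_tab_format`, and `sample` are hypothetical and are not part of the commit.

```python
def pair_lines(lines):
    """Yield (input, response) pairs from consecutive non-empty lines.

    Hypothetical helper: mirrors the loop in convert_twitter.py but
    works on any iterable of strings instead of a file on disk.
    """
    prev_line = None
    for line in lines:
        line = line.rstrip()
        # An empty string is falsy, so a blank separator line never
        # becomes an input or a response.
        if prev_line and line:
            yield prev_line, line
        prev_line = line


def to_tab_format(lines):
    """Return rows in the INPUT<tab>RESPONSE layout (no trailing newline)."""
    return ["%s\t%s" % pair for pair in pair_lines(lines)]


# Made-up sample data: two conversations separated by an empty line.
sample = [
    "hi there\n",
    "hello\n",
    "how are you\n",
    "\n",
    "new topic\n",
    "ok\n",
]
rows = to_tab_format(sample)
for row in rows:
    print(row)
```

Under this rule, a line in the middle of a conversation appears twice: once as a response and once as the input to the next line. The first line after an empty separator only ever appears as an input.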
