Commit 72ba1d8

Update tokenizing_test.py
1 parent fce2fd2 commit 72ba1d8

1 file changed: +6 -0 lines changed

tokenizing_test.py

Lines changed: 6 additions & 0 deletions
@@ -24,6 +24,12 @@ def main(data_path: str):
     text = df[0].apply(lambda x: x.strip()).tolist()
 
     for i, t in enumerate(text):
+
+        for x in range(10):
+            print(t[x], flush=True, end="|")
+        break
+
+        print(type(t))
         if i % 100_000 == 0:
             print(f"At {i}, {len(text) - i} to go.")
         tokenizer(t)
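
The debugging lines added in this commit index `t[x]` for every `x` in `range(10)`, which would raise `IndexError` on any string shorter than ten characters. A minimal sketch of a slice-based preview is shown here for comparison. `preview` is a hypothetical helper that is not part of `tokenizing_test.py`, and the sample strings are made up for illustration.

```python
def preview(t: str, n: int = 10, sep: str = "|") -> str:
    """Return the first n characters of t, each followed by sep.

    Hypothetical helper for illustration only. Slicing with t[:n]
    simply yields fewer characters when t is shorter than n, so no
    IndexError is raised.
    """
    return "".join(ch + sep for ch in t[:n])


if __name__ == "__main__":
    # Made-up sample inputs, including a short and an empty string.
    for sample in ["tokenizing text", "short", ""]:
        print(preview(sample), type(sample).__name__, flush=True)
```

Returning a string rather than printing inside the helper keeps the check easy to assert on, and the caller decides whether to print or log it.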