add util script for data cleaning

1 parent 7d0c715 commit 21aeecb72ad4f0dc1d1b24ba73a87b788be771e9 Daniel Erenrich committed Nov 18, 2011
Showing with 5 additions and 0 deletions.
  1. +5 −0 frombulate.sh
@@ -0,0 +1,5 @@
+cat data.csv | grep -v " " | shuf > clean_data.csv
+cat clean_data.csv | head -n -50000 | cut -d" " -f 1,2 --complement > train_features.csv
+cat clean_data.csv | head -n -50000 | cut -d" " -f 1,2 > train_labels.csv
+cat clean_data.csv | tail -n 50000 | cut -d" " -f 1,2 --complement > test_features.csv
+cat clean_data.csv | tail -n 50000 | cut -d" " -f 1,2 > test_labels.csv
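
The split performed by this script can be sketched as a reusable function. This is only an illustration under stated assumptions, not part of the commit: `split_dataset` and its three parameters are hypothetical names, the input is assumed to be already cleaned and shuffled, fields are assumed to be space-delimited as in the `cut -d" "` calls above, and the first two columns are assumed to be the labels (inferred from the output file names). It uses `cut -f 3-` in place of `--complement`, and computes the training row count explicitly instead of relying on `head -n -50000`, since both of those are GNU extensions.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): split a shuffled, space-delimited
# file into train/test label and feature files.
#   $1  input file (already cleaned and shuffled)
#   $2  number of trailing rows held out as the test set
#   $3  output directory
split_dataset() {
    input=$1
    holdout=$2
    outdir=$3

    total=$(wc -l < "$input")
    train=$((total - holdout))

    # Refuse to split when the file has fewer rows than the holdout size.
    if [ "$train" -lt 0 ]; then
        echo "split_dataset: holdout ($holdout) exceeds row count" >&2
        return 1
    fi

    # Columns 1-2 are treated as labels, columns 3 onward as features
    # (an assumption based on the output file names in the commit).
    head -n "$train" "$input"   | cut -d" " -f 1,2 > "$outdir/train_labels.csv"
    head -n "$train" "$input"   | cut -d" " -f 3-  > "$outdir/train_features.csv"
    tail -n "$holdout" "$input" | cut -d" " -f 1,2 > "$outdir/test_labels.csv"
    tail -n "$holdout" "$input" | cut -d" " -f 3-  > "$outdir/test_features.csv"
}
```

With the file names from the commit, a call such as `split_dataset clean_data.csv 50000 .` would be expected to produce the same four files, provided `clean_data.csv` has at least 50000 rows.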
