# Assignment 5 - Text Analysis
An explanation of this assignment can be found in the accompanying .pdf document.


## Materials to review for this assignment
<h4>From Moodle:</h4> 
<h5><u>Review the notebooks regarding the following python topics</u>:</h5>
<div class="alert alert-info">
&#x2714; <b>Working with strings</b> (tutorial notebook)<br/>
&#x2714; <b>Text Analysis</b> (tutorial notebook)<br/>
&#x2714; <b>Hebrew text analysis tools (tokenizer, wordnet)</b> (moodle example)<br/>
&#x2714; <b>(brief review) All previous notebooks</b><br/>
</div> 
<h5><u>Review the presentations regarding the following topics</u>:</h5>
<div class="alert alert-info">
&#x2714; <b>Text Analysis</b> (lecture presentation)<br/>
&#x2714; <b>(brief review) All other presentations</b><br/>
</div>

## Preceding Step - import modules (packages)
This step is necessary in order to use external modules (packages). <br/>

In [166]:
# --------------------------------------
import pandas as pd
import numpy as np
# --------------------------------------


# --------------------------------------
# ------------- visualizations:
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
# --------------------------------------


# ---------------------------------------
import sklearn
from sklearn import preprocessing, metrics, pipeline, model_selection, feature_extraction 
from sklearn import naive_bayes, linear_model, svm, neural_network, neighbors, tree
from sklearn import decomposition, cluster
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV 
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.metrics import mean_squared_error, r2_score, silhouette_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler, LabelEncoder

from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import Perceptron, SGDClassifier
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.naive_bayes import MultinomialNB, GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
# ---------------------------------------


# ----------------- output and visualizations: 
import warnings
from sklearn.exceptions import ConvergenceWarning
warnings.simplefilter("ignore")
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter("ignore", category=ConvergenceWarning)
# show the output of every expression in a cell (not only the last one), so several results fit in one cell.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
%matplotlib inline
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', lambda x: '%.3f' % x)
# ---------------------------------------

### Text analysis and String manipulation imports:

In [167]:
# --------------------------------------
# --------- Text analysis and Hebrew text analysis imports:
# vectorizers:
from sklearn.feature_extraction import text
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# regular expressions:
import re
# --------------------------------------

### (optional) Hebrew text analysis - WordNet (for Hebrew)
Note: WordNet is optional.

#### (optional) Only if you haven't installed WordNet (for Hebrew), use:

In [168]:
# word net installation:

# uncomment if you want to use it and need to install it:
# !pip install wn
# !python -m wn download omw-he:1.4

In [169]:
# word net import:

# uncomment if you want to use it:
# import wn

### (optional) Hebrew text analysis - hebrew_tokenizer (Tokenizer for Hebrew)
Note: hebrew_tokenizer is optional.

#### (optional) Only if you haven't installed hebrew_tokenizer, use:

In [170]:
# Hebrew tokenizer installation:

# uncomment if you want to use it and need to install it:
#!pip install hebrew_tokenizer

In [171]:
# Hebrew tokenizer import:

# comment out if you do not use the tokenizer (the test cells below need it):
import hebrew_tokenizer as ht

In [172]:
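# named sample_text so it does not shadow the `text` module imported from sklearn above
sample_text = 'כשחבר הזמין אותי לחול, לא באמת חשבתי שזה יקרה'  # test

In [173]:
list(ht.tokenize(sample_text))  # test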

[('HEBREW', 'כשחבר', 0, (0, 5)),
 ('HEBREW', 'הזמין', 1, (6, 11)),
 ('HEBREW', 'אותי', 2, (12, 16)),
 ('HEBREW', 'לחול', 3, (17, 21)),
 ('PUNCTUATION', ',', 4, (21, 22)),
 ('HEBREW', 'לא', 5, (23, 25)),
 ('HEBREW', 'באמת', 6, (26, 30)),
 ('HEBREW', 'חשבתי', 7, (31, 36)),
 ('HEBREW', 'שזה', 8, (37, 40)),
 ('HEBREW', 'יקרה', 9, (41, 45))]
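
Each tuple in the tokenizer output above holds a group label, the token text, the token index, and a character span. As a minimal sketch (the helper name `hebrew_words` is hypothetical, and the token list is copied by hand from the output above so the snippet runs without the `hebrew_tokenizer` package), the Hebrew words can be separated from punctuation like this:

```python
# Tokens in the (group, token, index, span) shape shown in the output above.
tokens = [
    ('HEBREW', 'כשחבר', 0, (0, 5)),
    ('HEBREW', 'הזמין', 1, (6, 11)),
    ('HEBREW', 'אותי', 2, (12, 16)),
    ('HEBREW', 'לחול', 3, (17, 21)),
    ('PUNCTUATION', ',', 4, (21, 22)),
    ('HEBREW', 'לא', 5, (23, 25)),
]


def hebrew_words(token_stream):
    """Return only the token strings whose group label is 'HEBREW'."""
    return [tok for group, tok, _idx, _span in token_stream if group == 'HEBREW']


words = hebrew_words(tokens)
print(words)
```

With the real tokenizer installed, the same helper could presumably be fed `ht.tokenize(...)` directly, since it only relies on the four-field tuple shape.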

### Reading input files
Read the input files: the annotated training corpus (raw text data with labels) and the test corpus.

In [174]:
train_filename = 'annotated_corpus_for_train.csv'
test_filename  = 'corpus_for_test.csv'
df_train = pd.read_csv(train_filename, index_col=None, encoding='utf-8')
df_test  = pd.read_csv(test_filename, index_col=None, encoding='utf-8')

In [175]:
df_train.head(8)
df_train.shape

Unnamed: 0,story,gender
0,"כשחבר הזמין אותי לחול, לא באמת חשבתי שזה יקרה,...",m
1,לפני שהתגייסתי לצבא עשיתי כל מני מיונים ליחידו...,m
2,מאז שהתחילו הלימודים חלומו של כל סטודנט זה הפנ...,f
3,"כשהייתי ילד, מטוסים היה הדבר שהכי ריתק אותי. ב...",m
4,‏הייתי מדריכה בכפר נוער ומתאם הכפר היינו צריכי...,f
5,לפני כ3 חודשים טסתי לרומא למשך שבוע. טסתי במטו...,f
6,אני כבר שנתיים נשוי והשנה אני ואישתי סוף סוף י...,m
7,השנה התחלנו שיפוץ בדירה שלנו בתל אביב. הדירה ה...,f


(753, 2)

In [176]:
df_test.head(3)
df_test.shape

Unnamed: 0,test_example_id,story
0,0,כל קיץ אני והמשפחה נוסעים לארצות הברית לוס אנג...
1,1,"הגעתי לשירות המדינה אחרי שנתיים כפעיל בתנועת ""..."
2,2,אחת האהבות הגדולות שלי אלו הכלבים שלי ושל אישת...


(323, 2)

### Your implementation:
Write your solution in the following code cells.

In [177]:
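def clean_df(df_data):
    # Work on a copy and use vectorized string methods; the element-wise
    # df["story"][indx] = ... form is chained assignment, which pandas warns
    # about and which may not modify the DataFrame at all.
    df_data = df_data.copy()
    story = df_data["story"]
    story = story.str.replace(r'\s+', ' ', regex=True).str.strip()  # collapse whitespace
    story = story.str.replace(r'\d+', '', regex=True)               # remove digits
    story = story.str.replace(r'[^\w\s]', '', regex=True)           # remove punctuation
    df_data["story"] = story
    return df_data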
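
To make the cleaning steps concrete, here is a small sketch that applies the same three regular expressions to a single string. The function name `clean_text` and the sample sentence are illustrative assumptions, not part of the assignment. In Python 3, `\w` is Unicode-aware, so Hebrew letters are kept while punctuation is dropped.

```python
import re


def clean_text(story):
    """Apply the same steps as clean_df to one string."""
    story = re.sub(r'\s+', ' ', story).strip()  # collapse runs of whitespace
    story = re.sub(r'\d+', '', story)           # remove digits
    story = re.sub(r'[^\w\s]', '', story)       # remove punctuation
    return story


raw = 'לפני כ3 חודשים   טסתי לרומא, למשך שבוע.'
cleaned = clean_text(raw)
print(cleaned)
```

Because digits are removed after whitespace is collapsed, a standalone number leaves two adjacent spaces behind (for example `'in 2020 we'` becomes `'in  we'`). This should not matter for `CountVectorizer`, whose default tokenizer only extracts word tokens.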

In [178]:
df_train = clean_df(df_train) 
df_test = clean_df(df_test)

In [179]:
df_train.head()

Unnamed: 0,story,gender
0,כשחבר הזמין אותי לחול לא באמת חשבתי שזה יקרה פ...,m
1,לפני שהתגייסתי לצבא עשיתי כל מני מיונים ליחידו...,m
2,מאז שהתחילו הלימודים חלומו של כל סטודנט זה הפנ...,f
3,כשהייתי ילד מטוסים היה הדבר שהכי ריתק אותי בתו...,m
4,הייתי מדריכה בכפר נוער ומתאם הכפר היינו צריכים...,f


## Convert labels to numbers

In [180]:
label_encoder  = LabelEncoder()
df_train['gender'] = label_encoder.fit_transform(df_train['gender'])
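
`LabelEncoder` sorts the distinct labels and numbers them from 0, so with the labels `f` and `m` the expected mapping is `f -> 0` and `m -> 1`. A minimal sketch on a toy list (the variable names here are illustrative) shows how to inspect the mapping and how to turn numeric predictions back into the original labels:

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels in the same style as the 'gender' column.
labels = ['m', 'm', 'f', 'm', 'f']

encoder = LabelEncoder()
codes = encoder.fit_transform(labels)

# classes_ holds the sorted original labels; a label's position is its code.
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)

# inverse_transform converts numeric codes back to the original labels.
decoded = encoder.inverse_transform([1, 0]).tolist()
print(decoded)
```

Keeping the fitted encoder around is useful later, because model predictions come out as 0/1 and may need to be reported as `f`/`m`.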

In [181]:
df_train.head()

Unnamed: 0,story,gender
0,כשחבר הזמין אותי לחול לא באמת חשבתי שזה יקרה פ...,1
1,לפני שהתגייסתי לצבא עשיתי כל מני מיונים ליחידו...,1
2,מאז שהתחילו הלימודים חלומו של כל סטודנט זה הפנ...,0
3,כשהייתי ילד מטוסים היה הדבר שהכי ריתק אותי בתו...,1
4,הייתי מדריכה בכפר נוער ומתאם הכפר היינו צריכים...,0


In [182]:
vec=CountVectorizer(max_features=1000)
X = vec.fit_transform(df_train['story'])

In [183]:
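# YOUR CODE HERE
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
feature_name = vec.get_feature_names_out().tolist()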

In [184]:
X_test_data = vec.transform(df_test['story'])
X_test_data = pd.DataFrame(X_test_data.toarray(),columns=feature_name)
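
The vectorizer is fitted on the training stories only, and the test stories are then mapped onto the same vocabulary with `transform`. The sketch below illustrates this on a hypothetical toy corpus (the English sentences and variable names are assumptions for illustration): `max_features` keeps the most frequent terms across the corpus, and words outside the fitted vocabulary are ignored at transform time.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy training corpus: 'red' occurs 4 times, 'blue' 2 times, 'green' once.
train_docs = ["red red red blue", "red blue green"]

toy_vec = CountVectorizer(max_features=2)  # keep only the 2 most frequent terms
train_counts = toy_vec.fit_transform(train_docs)

# vocabulary_ maps each kept term to its column index; sorting by index gives
# the column order (this attribute is available across scikit-learn versions).
names = sorted(toy_vec.vocabulary_, key=toy_vec.vocabulary_.get)
print(names)
print(train_counts.toarray())

# transform() reuses the fitted vocabulary: 'green' and 'purple' are not in it,
# so they are dropped and only 'red' is counted.
test_counts = toy_vec.transform(["green red purple"])
print(test_counts.toarray())
```

Calling `fit_transform` on the test set instead would build a different vocabulary, and the test columns would no longer line up with the training columns.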

In [185]:
X_test_data.head()

Unnamed: 0,אבא,אביב,אבל,אדם,או,...
0,0,0,1,0,0,...
1,0,0,3,0,0,...
2,0,0,1,0,0,...
3,0,0,2,0,0,...
4,0,0,3,0,0,...

[5 rows x 1000 columns]


In [186]:
feature_name

['אבא',
 'אביב',
 'אבל',
 'אדם',
 'או',
 'אוהב',
 'אוהבים',
 'אוהבת',
 'אוויר',
 'אוכל',
 'אולי',
 'אומר',
 'אותה',
 'אותו',
 'אותי',
 'אותם',
 'אותנו',
 'אז',
 'אחד',
 'אחותי',
 'אחי',
 'אחר',
 'אחרי',
 'אחרים',
 'אחרת',
 'אחת',
 'אי',
 'איזה',
 'איך',
 'אין',
 'איפה',
 'איש',
 'אישתי',
 'איתה',
 'איתו',
 'איתי',
 'איתם',
 'איתנו',
 'אך',
 'אכלנו',
 'אכן',
 'אל',
 'אלא',
 'אלו',
 'אלי',
 'אליה',
 'אליהם',
 'אליו',
 'אליי',
 'אלינו',
 'אם',
 'אמא',
 'אמור',
 'אמר',
 'אמרה',
 'אמרו',
 'אמרתי',
 'אנחנו',
 'אני',
 'אנשים',
 'אף',
 'אפילו',
 'אפשר',
 'אצל',
 'אצלי',
 'ארוחת',
 'ארוך',
 'ארוכה',
 'אשר',
 'אשתי',
 'את',
 'אתה',
 'בא',
 'באופן',
 'באותה',
 'באותו',
 'באזור',
 'באחד',
 'באיזור',
 'באילת',
 'באינטרנט',
 'באמצע',
 'באמת',
 'בארץ',
 'בבוקר',
 'בבית',
 'בגדים',
 'בגלל',
 'בדיוק',
 'בדיקה',
 'בדיקות',
 'בדיקת',
 'בדירה',
 'בדרך',
 'בה',
 'בהם',
 'בהתחלה',
 'בו',
 'בוקר',
 'בזה',
 'בזמן',
 'בחברה',
 'בחדר',
 'בחודש',
 'בחוף',
 'בחוץ',
 'בחור',
 'בחזרה',
 'בחיי',
 'בחיים',
 'בחרתי',


In [187]:
len(feature_name)

1000

In [188]:
X = pd.DataFrame(X.toarray(),columns=feature_name)

In [189]:
X.head()

Unnamed: 0,אבא,אביב,אבל,אדם,או,...
0,0,0,4,0,0,...
1,0,0,1,3,0,...
2,0,0,5,0,0,...
0
3,5,0,2,1,0,0,0,0,0,0,0,0,0,3,2,0,0,2,2,0,0,0,1,0,0,0,1,1,3,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,4,0,1,1,0,0,0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,1,0,1,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,2,0,0,0,1,1,0,0,0,1,0,0,0,0,0,7,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,2,0,0,0,3,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,7,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,1,1,0,0,0,0,0,0,0,2,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,3,0,0,1,0,0,0,0,0,1,11,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,1,1,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,0,0,1,0,0,0,0,5,2,0,0,0,1,0,0,12,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
,0,0
4,0,0,2,0,1,0,0,0,0,0,0,0,0,0,5,1,0,0,2,0,0,0,3,0,0,2,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,1,0,0,0,0,0,0,0,6,0,0,1,0,1,1,0,0,0,0,1,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,3,0,0,1,0,0,0,1,0,0,3,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,2,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,2,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,2,0,2,0,0,0,0,1,0,0,1,4,0,0,0,0,0,0,0,3,0,2,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,1,1,0,2,6,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,1,2,0,0,1,0,0,1,2,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,2,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0


In [190]:
df_train['story'].iloc[0]  # inspect the first training story

'כשחבר הזמין אותי לחול לא באמת חשבתי שזה יקרה פשוט אמרתי לו כן ותיארתי לעצני שזה יתבטל אחרי שבועיים בערך אני מקבל טלפוןם ממנו שומע מצאתי אחלה מקודות שנוטכל טייל בהם ואז הבנתי שזה הולך לקרות התחלתי להתארגןם על דברים ציוד להליכה תיקים בגדים חמים כסף ודרכון מעודכן לאחר תכנונים נפגשנו בשדה הוא הביא לי את אחד מהתיקים שלו כי לי אין תיק טוב לטיולים ועלינו למטוס לאיטליה בטיסה עצמה לא הצלחתי לישון היה ילד קטן שבכה כל הדרך מעצבן כשהגענו הלכנו ישר לסוכנות השכרת הרכב ולקחנו את הרכב שהזמנו מראש סיטרואל C בצבע סגול כי זה מה שנשאר חצי קראנו לה עלינו על חצי והתחלנו את המסע לכיוון אגם גארדה השעה הייתה  בערב קצת קריר בחוץ חושך מוות ואין לנו מושג לאן אנחנו נוסעים רק עם GPS בהתחלה התחלנו לחפש מקום לישון בו מצאנו עיירה סמוכה והחלטנו ללכת לשם על הדרך עצרנו בפיצה הםיצה הראשונה באיטליה משם המשכנו לעיירה עצמה ומצאנו אכסנייה די נחמדה שבה עצרנו ללילה בבוקר שלמחורת הוא מצא מסלול טיול על אחד ההרים באזור נסענו לשם נסיעה של כשעה בערך התחלנו לעלות עם הרכב לכיוון המסלול הדרך הייתה נופית עצים ויער מכל כיוון עד שבאיזשהו

In [191]:
y = df_train.gender

In [192]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2)
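The split above is unseeded, so every run produces different train and test sets and slightly different reports. A minimal sketch of a repeatable, class-balanced split follows; `X_demo` and `y_demo` are synthetic stand-ins (assumptions, not the assignment data), and `random_state=42` is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): 100 samples, 5 count features,
# 25 samples of class 0 and 75 of class 1.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 5, size=(100, 5))
y_demo = np.array([0] * 25 + [1] * 75)

# random_state makes the split repeatable;
# stratify keeps the class ratio the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)                # sizes of the two parts
print(np.bincount(y_tr), np.bincount(y_te))  # class counts in each part
```

With `stratify`, the minority class cannot end up under-represented in the test set by chance, which matters when one class is much smaller than the other.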

In [193]:
X.shape

(753, 1000)

In [194]:
X_train.shape

(602, 1000)

In [195]:
X_test.shape

(151, 1000)

### Machine learning

In [196]:
lg = linear_model.LogisticRegression()
lg.fit(X_train,y_train)

LogisticRegression()

In [197]:
lg2 = RandomForestClassifier()  # random forest; the name lg2 is reused by later cells
lg2.fit(X_train,y_train)

RandomForestClassifier()

In [198]:
train_pred = lg2.predict(X_train)

In [199]:
print(classification_report(y_train, train_pred))  # training-set report for the random forest (lg2)

              precision    recall  f1-score   support

           0       1.00      1.00      1.00       144
           1       1.00      1.00      1.00       458

    accuracy                           1.00       602
   macro avg       1.00      1.00      1.00       602
weighted avg       1.00      1.00      1.00       602
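A perfect report on the training set mostly shows that the forest can memorize the data it was fitted on; it says little about unseen stories. One common way to get a more honest estimate is cross-validation. The sketch below is only an illustration on synthetic data with random labels (an assumption chosen so that there is nothing real to learn); `X_demo`, `y_demo` and `forest` are hypothetical names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data (hypothetical): the labels are random, so any high score
# on the training set can only come from memorization.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 5, size=(120, 20))
y_demo = rng.integers(0, 2, size=120)

forest = RandomForestClassifier(n_estimators=50, random_state=0)

# Score on the same data the model was fitted on (optimistic).
forest.fit(X_demo, y_demo)
train_acc = forest.score(X_demo, y_demo)

# 5-fold cross-validation: each fold is scored by a model that never saw it.
cv_scores = cross_val_score(forest, X_demo, y_demo, cv=5, scoring="accuracy")

print("training accuracy:", train_acc)
print("cross-validated accuracy:", cv_scores.mean())
```

The gap between the two numbers is the part of the training score that does not carry over to new data.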



In [200]:
test_pred = lg.predict(X_test)
test_pred2 = lg2.predict(X_test)

In [201]:
print(classification_report(y_test, test_pred))  # test set, logistic regression, 1000 features

              precision    recall  f1-score   support

           0       0.55      0.53      0.54        34
           1       0.86      0.87      0.87       117

    accuracy                           0.79       151
   macro avg       0.70      0.70      0.70       151
weighted avg       0.79      0.79      0.79       151



In [202]:
print(classification_report(y_test, test_pred2))  # test set, random forest, 1000 features

              precision    recall  f1-score   support

           0       1.00      0.06      0.11        34
           1       0.79      1.00      0.88       117

    accuracy                           0.79       151
   macro avg       0.89      0.53      0.50       151
weighted avg       0.83      0.79      0.71       151
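The random forest report above reaches the same 0.79 accuracy as logistic regression while finding almost none of the class-0 stories. The arithmetic can be checked by hand. The counts below are inferred from the rounded report (an assumption, since the notebook does not print a confusion matrix): 2 of the 34 class-0 stories predicted as 0, and all 117 class-1 stories predicted as 1.

```python
# Counts inferred from the rounded report (assumption, not notebook output).
tp0, fn0, fp0 = 2, 32, 0      # treating class 0 as the "positive" class
tp1, fn1, fp1 = 117, 0, 32    # treating class 1 as the "positive" class


def prf(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p0, r0, f0 = prf(tp0, fp0, fn0)
p1, r1, f1 = prf(tp1, fp1, fn1)

accuracy = (tp0 + tp1) / (34 + 117)   # correct predictions / all test stories
macro_f1 = (f0 + f1) / 2              # unweighted mean over the two classes

print(f"class 0: precision={p0:.2f} recall={r0:.2f} f1={f0:.2f}")
print(f"class 1: precision={p1:.2f} recall={r1:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f} macro f1={macro_f1:.2f}")
```

Accuracy is dominated by the large class, so with imbalanced labels the macro average and the per-class recall are usually the more informative rows when comparing models.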



In [222]:
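# k-nearest neighbours with k=2, referenced through the imported sklearn.neighbors module
knn = neighbors.KNeighborsClassifier(n_neighbors=2)
knn.fit(X_train,y_train)
test_pred = knn.predict(X_test)
print(classification_report(y_test, test_pred))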

KNeighborsClassifier(n_neighbors=2)

              precision    recall  f1-score   support

           0       0.31      0.53      0.39        34
           1       0.83      0.66      0.73       117

    accuracy                           0.63       151
   macro avg       0.57      0.59      0.56       151
weighted avg       0.71      0.63      0.66       151



In [223]:
# single decision tree, referenced through the imported sklearn.tree module
dt = tree.DecisionTreeClassifier()
dt.fit(X_train,y_train)
test_pred = dt.predict(X_test)
print(classification_report(y_test, test_pred))

DecisionTreeClassifier()

              precision    recall  f1-score   support

           0       0.32      0.26      0.29        34
           1       0.80      0.84      0.82       117

    accuracy                           0.71       151
   macro avg       0.56      0.55      0.55       151
weighted avg       0.69      0.71      0.70       151



In [224]:
# multinomial naive Bayes, a common baseline for count features
model = naive_bayes.MultinomialNB()
model.fit(X_train,y_train)
test_pred = model.predict(X_test)
print(classification_report(y_test, test_pred))

MultinomialNB()

              precision    recall  f1-score   support

           0       0.53      0.56      0.54        34
           1       0.87      0.85      0.86       117

    accuracy                           0.79       151
   macro avg       0.70      0.71      0.70       151
weighted avg       0.79      0.79      0.79       151



In [225]:
# linear support vector classifier
model = svm.LinearSVC(random_state=42)
model.fit(X_train,y_train)
test_pred = model.predict(X_test)
print(classification_report(y_test, test_pred))

LinearSVC(random_state=42)

              precision    recall  f1-score   support

           0       0.48      0.47      0.48        34
           1       0.85      0.85      0.85       117

    accuracy                           0.77       151
   macro avg       0.67      0.66      0.66       151
weighted avg       0.77      0.77      0.77       151



In [226]:
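# linear model trained with stochastic gradient descent;
# this is the last model fitted, so it is the one used for the final predictions
model = linear_model.SGDClassifier(random_state=42)
model.fit(X_train,y_train)
test_pred = model.predict(X_test)
print(classification_report(y_test, test_pred))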

SGDClassifier(random_state=42)

              precision    recall  f1-score   support

           0       0.50      0.65      0.56        34
           1       0.89      0.81      0.85       117

    accuracy                           0.77       151
   macro avg       0.69      0.73      0.71       151
weighted avg       0.80      0.77      0.78       151
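The cells above repeat the same fit, predict, report steps for each classifier. A possible way to compare them side by side is a small loop that records one summary metric per model. The sketch below uses synthetic count data and macro F1 as the metric; the data, the `candidates` dictionary and the choice of macro F1 are all assumptions for illustration, not the assignment's required procedure.

```python
import numpy as np
from sklearn import linear_model, naive_bayes, svm
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced count data (hypothetical): 60 class-0 and 180 class-1 rows.
rng = np.random.default_rng(0)
y_demo = np.array([0] * 60 + [1] * 180)
# Five informative columns: class 1 has a higher average count than class 0.
rates = np.repeat(np.where(y_demo == 1, 3.0, 1.0)[:, None], 5, axis=1)
signal = rng.poisson(rates)
noise = rng.poisson(1.0, size=(240, 15))   # fifteen uninformative columns
X_demo = np.hstack([signal, noise])

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

candidates = {
    "LogisticRegression": linear_model.LogisticRegression(max_iter=1000),
    "MultinomialNB": naive_bayes.MultinomialNB(),
    "LinearSVC": svm.LinearSVC(random_state=42),
    "SGDClassifier": linear_model.SGDClassifier(random_state=42),
}

# Fit every candidate on the same split and keep its macro-averaged F1.
macro_f1 = {}
for name, clf in candidates.items():
    clf.fit(X_tr, y_tr)
    macro_f1[name] = f1_score(y_te, clf.predict(X_te), average="macro")

best_name = max(macro_f1, key=macro_f1.get)
for name, score in macro_f1.items():
    print(f"{name:20s} macro F1 = {score:.2f}")
print("best on this split:", best_name)
```

Keeping each fitted model under its own name also avoids relying on whichever classifier happened to be assigned to a shared variable last.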



### Predict test data

In [227]:
test_predictions = model.predict(X_test_data)  # model is the SGDClassifier fitted in the previous cell

In [228]:
df_test['gender'] = test_predictions

In [229]:
df_test['gender'] = label_encoder.inverse_transform(df_test['gender'])
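The `inverse_transform` call maps the numeric predictions back to the original string labels. A small round trip shows the idea; it assumes the notebook's `label_encoder` was fitted on `'f'`/`'m'` strings, which is not shown in this section, and `label_encoder_demo` is a hypothetical stand-in.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

label_encoder_demo = LabelEncoder()

# fit_transform learns the sorted set of labels and returns integer codes.
encoded = label_encoder_demo.fit_transform(["m", "f", "m", "m", "f"])
print(label_encoder_demo.classes_)   # labels in sorted order; position = code
print(encoded)

# inverse_transform turns integer codes (e.g. model predictions) back into labels.
decoded = label_encoder_demo.inverse_transform(np.array([1, 1, 0]))
print(decoded)
```

Because the codes follow the sorted label order, checking `classes_` is a quick way to confirm which class a `0` or `1` row in a classification report refers to.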

In [230]:
df_test.head()

| | test_example_id | story | gender |
|---|---|---|---|
| 0 | 0 | כל קיץ אני והמשפחה נוסעים לארצות הברית לוס אנג... | m |
| 1 | 1 | הגעתי לשירות המדינה אחרי שנתיים כפעיל בתנועת י... | m |
| 2 | 2 | אחת האהבות הגדולות שלי אלו הכלבים שלי ושל אישת... | m |
| 3 | 3 | רגע הגיוס לצבא היה הרגע הכי משמעותי עבורי אני ... | m |
| 4 | 4 | אני הגעתי לברזיל ישר מקולומביה וגם אני עשיתי ע... | f |


In [231]:
# keep only the id and the prediction, renaming the prediction column for the output file
df_predicted = df_test[['test_example_id','gender']].rename(columns={'gender': 'predicted_category'})

In [220]:
df_predicted.head()

| | test_example_id | predicted_category |
|---|---|---|
| 0 | 0 | m |
| 1 | 1 | m |
| 2 | 2 | m |
| 3 | 3 | m |
| 4 | 4 | f |


### Save output to csv (optional)
After you're done, save your output to the 'classification_results.csv' file.<br/>
We assume that the dataframe with your results contains the following columns:
* column 1 (left column): 'test_example_id' - the same id associated with each of the test stories to be predicted.
* column 2 (right column): 'predicted_category' - the predicted gender value for the associated story.

Assuming your predicted values are in the `df_predicted` dataframe, you should save your results as follows:

In [158]:
#df_predicted.to_csv('classification_results.csv',index=False)
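Before submitting, it can help to read the file back and confirm it has exactly the two expected columns. The sketch below writes a tiny hypothetical dataframe (`df_demo`, not the real predictions) to a temporary folder, and also shows what happens when `index=False` is left out.

```python
import os
import tempfile

import pandas as pd

# Tiny stand-in for df_predicted (hypothetical values).
df_demo = pd.DataFrame(
    {"test_example_id": [0, 1, 2], "predicted_category": ["m", "m", "f"]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # index=False writes only the two named columns.
    good_path = os.path.join(tmp_dir, "classification_results.csv")
    df_demo.to_csv(good_path, index=False)
    df_check = pd.read_csv(good_path)

    # Without index=False the row index is written as an extra unnamed column,
    # which pandas labels "Unnamed: 0" when the file is read back.
    bad_path = os.path.join(tmp_dir, "with_index.csv")
    df_demo.to_csv(bad_path)
    df_with_index = pd.read_csv(bad_path)

print(list(df_check.columns))
print(list(df_with_index.columns))
```

A check like this catches a stray index column before the file reaches whoever grades or consumes it.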