In [1]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

## Prepare the comments and an analyzer

In [2]:
#Instantiate an analyzer (needs the VADER lexicon; run nltk.download('vader_lexicon') once if it is missing)
sid = SentimentIntensityAnalyzer()

In [3]:
comments_df = pd.read_csv('./comments.csv',index_col=0)

In [4]:
comments_df.reset_index(drop=True,inplace=True)

In [5]:
comments_df.head()

Unnamed: 0,article_id,comments,is_reply
0,0.0,What's the point of studying so much ended up ...,0.0
1,0.0,No matter what kind of streaming or subject ba...,0.0
2,0.0,Seems to be that the purpose of this system is...,1.0
3,0.0,This feels like just another diversion from RE...,0.0
4,0.0,Isn’t a “real” issue the boxing of kids into s...,1.0


In [6]:
comments_df.tail()

Unnamed: 0,article_id,comments,is_reply
260,3.0,Good. another own self check own self. LoL,0.0
261,3.0,Louis is 100% correct! I support his thinking ...,0.0
262,3.0,"Election tactics lah, now act like good guy so...",1.0
263,3.0,One of the many useless jiao liao be in the house,0.0
264,3.0,Louis Ng is obviously an IDIOT.,0.0


## Sample some of the comments and check their scores

In [7]:
print(comments_df['comments'][264])
print('\n',sid.polarity_scores(comments_df['comments'][264]))

Louis Ng is obviously an IDIOT.

 {'neg': 0.446, 'neu': 0.554, 'pos': 0.0, 'compound': -0.6166}


In [8]:
print(comments_df['comments'][262])
print('\n',sid.polarity_scores(comments_df['comments'][262]))

Election tactics lah, now act like good guy so people would say "Garment care for them" see what happen if they win the next election.

 {'neg': 0.0, 'neu': 0.629, 'pos': 0.371, 'compound': 0.9081}


In [9]:
print(comments_df['comments'][260])
print('\n',sid.polarity_scores(comments_df['comments'][260]))

Good. another own self check own self. LoL

 {'neg': 0.0, 'neu': 0.513, 'pos': 0.487, 'compound': 0.6908}


In [10]:
print(comments_df['comments'][4])
print('\n',sid.polarity_scores(comments_df['comments'][4]))

Isn’t a “real” issue the boxing of kids into smart vs not so smart categories?

 {'neg': 0.0, 'neu': 0.69, 'pos': 0.31, 'compound': 0.6597}


In [11]:
print(comments_df['comments'][3])
print('\n',sid.polarity_scores(comments_df['comments'][3]))

This feels like just another diversion from REAL issues. Whether streaming or SBB, the main concern should be whether our policies are maximizing our limited resources (i.e. manpower) - will our people get good jobs now and in future. Will it prevent/address the under-employment issue? Proper training or education but if those entering the workforce still cannot find careers in areas where they are academically strong in (worse if they have the passion in certain areas but no opportunity), then what is the point?

 {'neg': 0.045, 'neu': 0.797, 'pos': 0.159, 'compound': 0.8981}


In [12]:
print(comments_df['comments'][0])
print('\n',sid.polarity_scores(comments_df['comments'][0]))

What's the point of studying so much ended up working as a Cleaner and taxi driver for university graduates while the rest of the position fill up by foreign talents who work to apply for Singapore citizenship then went back to China to join People Liberation Army and dropped Singapore Citizenship

 {'neg': 0.0, 'neu': 0.872, 'pos': 0.128, 'compound': 0.7096}


<div class='alert alert-block alert-warning'>
    The results don't look promising: only the first example received a negative score, even though pretty much all of the sampled comments were negative.
</div>
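
When reading the compound scores above, it helps to have a cut-off in mind. The VADER authors suggest treating a compound score of at least 0.05 as positive, at most -0.05 as negative, and anything in between as neutral. The sketch below applies that convention to the scores printed above; `label_from_compound` and its `threshold` parameter are hypothetical names introduced here for illustration, not part of NLTK.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a coarse sentiment label."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'

# Compound scores copied from the sampled comments above (index -> score)
sampled = {264: -0.6166, 262: 0.9081, 260: 0.6908, 4: 0.6597, 3: 0.8981, 0: 0.7096}

labels = {idx: label_from_compound(score) for idx, score in sampled.items()}
print(labels)
```

With this cut-off, five of the six sampled comments are labelled positive, which matches the impression that VADER is over-scoring them.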

## Scoring all comments

In [13]:
#Create lists to append the scores to
neg = []
neu = []
pos = []
compound = []

#Iterate through each comment, score it once, and append each score to its list
for comment in comments_df['comments']:
    scores = sid.polarity_scores(comment)
    neg.append(scores['neg'])
    neu.append(scores['neu'])
    pos.append(scores['pos'])
    compound.append(scores['compound'])

#Add each score as a column to the DataFrame
comments_df['neg'] = neg
comments_df['neu'] = neu
comments_df['pos'] = pos
comments_df['compound'] = compound

In [14]:
comments_df.head()

Unnamed: 0,article_id,comments,is_reply,neg,neu,pos,compound
0,0.0,What's the point of studying so much ended up ...,0.0,0.0,0.872,0.128,0.7096
1,0.0,No matter what kind of streaming or subject ba...,0.0,0.156,0.76,0.084,-0.8555
2,0.0,Seems to be that the purpose of this system is...,1.0,0.0,0.844,0.156,0.6322
3,0.0,This feels like just another diversion from RE...,0.0,0.045,0.797,0.159,0.8981
4,0.0,Isn’t a “real” issue the boxing of kids into s...,1.0,0.0,0.69,0.31,0.6597


In [15]:
#Save the comments with score
comments_df.to_csv('comments_with_score.csv')

## Checking out the top 10 positive comments

In [16]:
#Sort and look at the top 10 positive comments
comments_top10 = comments_df.sort_values('compound',ascending = False).head(10)

In [17]:
comments_top10

Unnamed: 0,article_id,comments,is_reply,neg,neu,pos,compound
82,1.0,Our elites are implementing this new system du...,0.0,0.051,0.842,0.108,0.9919
37,0.0,This is a good move because it allows individu...,0.0,0.108,0.72,0.171,0.9694
94,1.0,I can’t trust PAP government anymore. During t...,0.0,0.033,0.781,0.186,0.9568
91,1.0,Let us give credit where credit is due. Minist...,0.0,0.0,0.8,0.2,0.9565
215,3.0,"Mr. MP Keep your eye on the goal, to have a gr...",0.0,0.029,0.787,0.184,0.946
81,1.0,Good move! It allows for students to develop a...,0.0,0.0,0.736,0.264,0.9336
218,3.0,Hello MP don't just score points by not doing ...,0.0,0.0,0.853,0.147,0.928
75,1.0,Singapore and its political leadership and jud...,0.0,0.028,0.894,0.078,0.9209
262,3.0,"Election tactics lah, now act like good guy so...",1.0,0.0,0.629,0.371,0.9081
257,3.0,"At least a bright idea from a MIW, rather than...",0.0,0.075,0.791,0.135,0.9071


<div class='alert alert-block alert-warning'>
    Considering that 1 is the maximum positive score, it's surprising how highly these comments scored, especially given how some of them start, which clearly isn't positive.
</div>
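
One likely reason long comments end up so close to 1 or -1 is how the compound score is built: the valences of the individual words are summed, and that sum is then squashed into [-1, 1]. The sketch below re-implements that squashing step as I understand it from the VADER reference code; the function name `normalize` and the constant `alpha = 15` are assumptions here, and the code does not call NLTK itself.

```python
import math

def normalize(raw_sum, alpha=15):
    """Squash an unbounded sum of word valences into [-1, 1].

    alpha=15 is assumed to match the VADER reference implementation.
    """
    score = raw_sum / math.sqrt(raw_sum * raw_sum + alpha)
    return max(-1.0, min(1.0, score))

# The curve saturates quickly as the summed valence grows
for raw_sum in (1, 2, 4, 8, 16):
    print(raw_sum, round(normalize(raw_sum), 4))
```

Under this assumption, a long comment only needs a handful of mildly positive words ("good", "great", "support") outweighing its negative ones for the compound score to land above 0.9, regardless of what the comment is actually arguing.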

In [18]:
#The start sounds positive, but overall the comment seems to lean more towards negative
print(comments_top10.loc[37]['comments'])

This is a good move because it allows individuals to shine through their respective strengths and have the opportunity to strive for improvement in their weaker subjects so as to be promoted to a higher level if they improve.

However, the ideal situation would be to allow students and parents to decide on their own learning journeys by dropping the criteria for entry to each level.

It is not a good idea to remove the PSLE though. Instead, remove the entry criteria but offer recommendations. Make it very clear that standards will not be lowered to accommodate weaker learners if they choose to take on what appears to be beyond their ability. Then let the students and parents decide. Only then can they truly explore their potential, take risks and learn to be responsible adults who understand the consequences of personal choices. People are more enthusiastic when tackling challenges they have chosen.

The only problem is, this might take a toll on ‘success’ rates. Are we prepared 

In [19]:
#Positive example correctly picked out
print(comments_top10.loc[91]['comments'])

Let us give credit where credit is due. Minister Ong has shown leadership to address an issue which has been raised for years, even by his own parliamentarians. I hope other ministers could emulate Min Ong to address other issues which have been brought up for years - say CPF issue on withdrawal age, for medical treatment etc by Mininster Teo Josephine. It is good for ministers to re-validate certain policies, and assess their relevance. Well done Minister Ong YK - hope others have belly of guts like you. You display leadership. That's what we expect of our ministers.


In [20]:
#Another positive example correctly picked out
print(comments_top10.loc[81]['comments'])

Good move! It allows for students to develop according to their own strengths and encourage more mixing of students of different aptitude and background. This prepare the students better, cause in the real world, we live and work with people of different talents, skills set and backgrounds.


<div class='alert alert-block alert-warning'>
    Some positive comments were correctly picked out, but let's take a look at those that clearly aren't positive.
</div>

In [21]:
#Highest-scoring comment, but it starts with 'Our elites' ...
print(comments_top10.loc[82]['comments'])

Our elites are implementing this new system due to their political consideration. Let me list some of them below:

1) Social stratification is gradually emerging in school

It is getting more common to see more and more non-Chinese students in the technical or normal stream while the Chinese students filled up the majority of the slots in the express stream. For political consideration, our authority is trying to 'break up' the huge pools of non-Chinese students in the technical or normal stream by merging them with the Chinese students in the express stream but they would not be able to do that with our current streaming system, which effectively reinforces the stratification due to the differences in the students' ability.

This is because if you allow too many non-Chinese students to concentrate in a particular area (technical or normal stream classes etc), there is always a political risk that they would band together as a race (especially the Malays who are more united as a 

In [22]:
#Surprising how highly this comment scored, considering the number of negative words
print(comments_top10.loc[94]['comments'])

I can’t trust PAP government anymore. During that year, the minister in charge of education talked very convincing about streaming and now everything back to the same! For many years PAP government has accumulated wealth for themselves and exploited the feeling of many Singaporeans.

In fact They are destroying every Singaporeans’ dreams.
LBW said we have good AH Kong but she didn’t mentioned that our AH kong has too many illegitimate grand children that our AH Konghas to give free education to these illegitimate grand children which are foreign to us.
The country has come to this state is because our gov focus on $$$ in their pockets and not us, Singaporeans.
More


In [23]:
#Nope, not positive
print(comments_top10.loc[215]['comments'])

Mr. MP Keep your eye on the goal, to have a great education system. Dragging down top tier students in hopes to raise lower tier students is an incorrect approach. It even goes against Singapore’s motto Onwards and upwards. All levels must have a clear path to excellence and not every program fits every child. If express is a great system for the top tier, don’t change it. Fix what’s broken, provide a path, in one direction, upwards, for all students. Have a transition program for students that are identified as late bloomers, smart, but needing some motivation, etc. The fact is, philosophers and plowmen, each must do their part, to mold a better Singapore
More


## Checking out the top 10 negative comments

In [24]:
#Sort and look at the top 10 negative comments
comments_bottom10 = comments_df.sort_values('compound',ascending = True).head(10)

In [25]:
comments_bottom10

Unnamed: 0,article_id,comments,is_reply,neg,neu,pos,compound
249,3.0,"To Everyone in this Website, Especially PAP, O...",0.0,0.138,0.755,0.107,-0.9861
84,1.0,"To Everyone in this Website, Especially PAP, O...",0.0,0.137,0.755,0.109,-0.9845
46,0.0,"To Everyone in this Website, Especially PAP, O...",0.0,0.137,0.755,0.109,-0.9845
232,3.0,"good to hear from MP Louis Ng again, he used t...",0.0,0.231,0.724,0.045,-0.9719
261,3.0,Louis is 100% correct! I support his thinking ...,0.0,0.195,0.752,0.053,-0.8949
206,3.0,To SCAM or to be SCAM,0.0,0.689,0.311,0.0,-0.871
203,3.0,To SCAM or to be SCAM,1.0,0.689,0.311,0.0,-0.871
231,3.0,1) Locals charged with molesting crimes - alwa...,0.0,0.259,0.677,0.064,-0.8639
1,0.0,No matter what kind of streaming or subject ba...,0.0,0.156,0.76,0.084,-0.8555
43,0.0,Students who had gone tru streaming felt its u...,0.0,0.243,0.757,0.0,-0.8402


<div class='alert alert-block alert-warning'>
    Looking at the article ids, the top three negative comments appear to be the same comment posted under different articles on the topic, while another comment was repeated as a reply within the same article.
</div>
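
Repeated comments like these can be surfaced with `duplicated`. The sketch below runs on a small made-up frame (`toy_df` is a hypothetical stand-in for `comments_df`) and only catches exact text matches. Comment 249 scores slightly differently from 84 and 46, so its text is probably not byte-for-byte identical and would need fuzzier matching.

```python
import pandas as pd

# Toy stand-in for comments_df; the rows are made up for illustration
toy_df = pd.DataFrame({
    'article_id': [0.0, 1.0, 3.0, 3.0, 3.0],
    'comments': [
        'same long rant',
        'same long rant',
        'To SCAM or to be SCAM',
        'To SCAM or to be SCAM',
        'a unique comment',
    ],
    'is_reply': [0.0, 0.0, 0.0, 1.0, 0.0],
})

# keep=False marks every copy of a repeated comment, not just the later ones
repeated = toy_df[toy_df.duplicated(subset='comments', keep=False)]
print(repeated)

# Keep only the first occurrence of each comment text
deduped = toy_df.drop_duplicates(subset='comments', keep='first')
print(len(toy_df), '->', len(deduped))
```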

In [26]:
#Definitely the most negative
print(comments_bottom10.loc[249]['comments'])

To Everyone in this Website, Especially PAP, Opposition Parties & All Singaporean,

To improve our competitiveness in Global Economy , We really must REVAMP our entire school education system , in actual fact, it should have been Done it in over 20 years ago, during the 1990s .

From this website on “ Remove streaming in secondary schools to reduce social stratification ”, it has EXPOSE OUT these BIG PROBLEMS in Singapore that had spread over many years .

Unfortunately, our “ MOST EXPENSIVE GOVERNMENT IN THE WHOLE WORLD “ don’t seem to do much on it , though TALK very BIG in Mass Media that our “ MOST EXPENSIVE GOVERNMENT IN THE WHOLE WORLD “ is doing so ! ! !

The PROBLEMS that we are facing now are these , due to the very Harmful Effects of " DON' T CARE " of or , should say, SACRIFICE Professional Ethics & Moral Education for many years :===>

(1). Bad CORORATE CULTURES & Unethical SOCIAL VALUES are widespread in Singapore Business World & Singapore Society .

Even our 

In [27]:
#Topic took a complete turn
print(comments_bottom10.loc[232]['comments'])

good to hear from MP Louis Ng again, he used to be very vocal about animal rights but became silent after he entered parliament. he is the
right person to talk about the terrible scourge happening in this country,
the massive destruction of primary forests. it's alarming especially in the west part where i live, pls stop destroying the limited green just for the money. incidentally we have a new kid on the block to speak for us, he is NMP Associate Prof Walter Theseira, pls give a listen to him.


In [28]:
#Negative comment correctly identified
print(comments_bottom10.loc[261]['comments'])

Louis is 100% correct! I support his thinking and idea. Ong ye kuang is stubbornly wrong having benefited from elite, what the hell does he knows. You never in that category how do you know? Louis should challenge ong, lets have a public debate which parents wants to categorized their children. If you dont want why the govt is doing that? Our education system is sucks!


In [29]:
#On the fence about the changes, but a negative sentiment towards the education system
print(comments_bottom10.loc[1]['comments'])

No matter what kind of streaming or subject based banding, the kids will still have to study hard to get good results in order to move on to higher level. The stress level will still be the same or higher no matter what kind of education system we have here in Singapore. Our education system is just too competitive. Kids will still continue to have their tuition classes and have to study hard in order to stay the top. In short, no matter what education system in Singapore, it just can't change this fact. Everyone is just too scare to lose out.


In [30]:
#Proportion of comments flagged as positive
len(comments_df[comments_df['compound'] > 0])/len(comments_df)

0.4641509433962264
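
Note that the remaining 54% lumps together negative scores and scores of exactly zero. A minimal sketch of a fuller breakdown follows, run on the five compound scores from `head()` above instead of the full DataFrame; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Compound scores of the first five comments, copied from head() above
compound = pd.Series([0.7096, -0.8555, 0.6322, 0.8981, 0.6597])

# np.sign gives 1.0, 0.0 or -1.0, which we map to readable labels
sign = np.sign(compound).map({1.0: 'positive', 0.0: 'zero', -1.0: 'negative'})
shares = sign.value_counts(normalize=True)
print(shares)

# The mean of a boolean mask gives the same proportion as the len() ratio
positive_share = (compound > 0).mean()
print(positive_share)
```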

<div class='alert alert-block alert-warning'>
    The overall accuracy from sampling the top 10 positive and negative comments doesn't seem good, although the negative predictive value (true negatives out of predicted negatives) seemed pretty high. One possible next step in exploring the comments would be to manually vet and label them as positive or negative to see how accurately VADER is performing.
    <br>
    <br>
    Another step would be to explore topic modelling and see if it would be possible to cluster the type of comments (focused on education? government? unemployment?).
</div>
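
Once some comments have been labelled by hand, the check against VADER could look roughly like the sketch below. The `labelled` frame and its `manual` and `vader` columns are hypothetical, and the labels in it are made up purely to show the arithmetic; they are not results from this dataset.

```python
import pandas as pd

# Hypothetical hand labels next to VADER's predicted labels (made-up data)
labelled = pd.DataFrame({
    'manual': ['neg', 'neg', 'neg', 'neg', 'pos', 'pos', 'neg', 'pos'],
    'vader':  ['neg', 'pos', 'pos', 'neg', 'pos', 'pos', 'pos', 'neg'],
})

is_pos = labelled['manual'] == 'pos'
pred_pos = labelled['vader'] == 'pos'

tp = (is_pos & pred_pos).sum()    # positive, predicted positive
tn = (~is_pos & ~pred_pos).sum()  # negative, predicted negative
fp = (~is_pos & pred_pos).sum()   # negative, predicted positive
fn = (is_pos & ~pred_pos).sum()   # positive, predicted negative

accuracy = (tp + tn) / len(labelled)
sensitivity = tp / (tp + fn)  # true positives out of actual positives
specificity = tn / (tn + fp)  # true negatives out of actual negatives
npv = tn / (tn + fn)          # true negatives out of predicted negatives

print(f'accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} '
      f'specificity={specificity:.2f} npv={npv:.2f}')
```

Keeping specificity and negative predictive value separate matters here: the first asks how many truly negative comments VADER caught, the second how trustworthy a negative score is once VADER gives one.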