# Regular Expressions

In [26]:
import re
from nltk import word_tokenize, sent_tokenize, ngrams
from collections import Counter

In [3]:
text = '''Disney will have a competitive advantage over Netflix when the entertainment conglomerate launches a competing video streaming platform later this year, according to Wall Street analyst David Trainer.

″[Disney’s] got the ability to merchandise, which is another way to monetize content in a way that Netflix does not have,” Trainer said on CNBC’s “Closing Bell” Wednesday. He’s chief of the New Constructs research firm.

Netflix increased its subscription prices Tuesday, sending the stock up 6.5 percent that day as well.

However, Trainer called it a “key dilemma” for the company and it “makes their competitors more viable.” The dilemma, he explained, is that the streaming company relies too much on its subscribers to generate revenue.

“It’s a Catch-22 for a business model that, when you look at the fundamentals, really just doesn’t work,” Trainer alleged.

The Netflix price increase for U.S. subscribers ranges between 13 percent and 18 percent, which Victory Anthony of Aegis Capital sees as a positive, a view generally shared by much of the investment community.

“It’s all profit for the price increase and so they can either use that to invest in more original content or they can let that drop down to their down to the bottom line,” Anthony said on “Squawk Alley” Thursday.

Aegis put a hold on Neflix at current levels because it’s about 8 percent higher than Anthony’s price target of $325. The stock was trading steady around $351 midday Thursday, up more than 50 percent since the Christmas Eve washout. Netflix releases its fourth quarter earnings after the bell Thursday. Netflix last reported double-digit user growth, with 58 million U.S. and 78 million international subscribers.

Original content aside, Netflix has built its large subscriber base, in part, on licensed content from a number of third-party TV and movie studios that plan to crowd into the video streaming market, which already includes other established online rivals such as Amazon and Hulu.

Disney, which agreed to purchase Twenty-First Century Fox assets last summer, said it would pull its movies from Netflix when it launches Disney+ in late 2019. AT&T’s WarnerMedia announced in October it would release a platform in the fourth quarter of 2019. Apple could be dropping a service this year, Comcast’s NBCUniversal on Monday revealed plans for a free streaming program with ads slated for early 2020.

New Constructs’ Trainer said Netflix is vulnerable because it’s a “one-trick pony” with an online distribution system that is not “defensible.” The company can keep growing its subscriber base, but it will need to address cash flow, he said.

“You can count on one hand the number of firms that have, over time, successfully monetized original content. It’s an expensive, difficult proposition,” he argued. “Disney’s done it and part of the reason they’ve done [it] is because they’ve got better ways of monetizing.”
'''

## Find all dollar amounts in the text

In [37]:
dollar_amts = re.findall(r'\$(\d+)', text)
print(dollar_amts)

['325', '351']
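The pattern above captures only the integer part of each amount, which is enough for this article. A minimal sketch of a broader pattern follows, assuming amounts may also carry thousands separators or a decimal part. The sample sentence and the name `amount_pattern` are made up for illustration and do not come from the article.

```python
import re

# Hypothetical sample sentence, not taken from the article above.
sample = "Targets moved from $325 to $1,250.50, while $0.99 fees stayed flat."

# \$ matches a literal dollar sign. The group captures either digits with
# thousands separators or a plain run of digits, each with optional decimals.
amount_pattern = re.compile(r'\$(\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?)')

raw_amounts = amount_pattern.findall(sample)
# Strip the separators before converting to numbers.
amounts = [float(a.replace(',', '')) for a in raw_amounts]

print(raw_amounts)
print(amounts)
```

Because the pattern has a single capturing group, `re.findall` returns only the captured amount without the dollar sign, just as in the cell above.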


## Substitute all digits with # and print text

In [9]:
print(re.sub(r'[0-9]','#',text))

Disney will have a competitive advantage over Netflix when the entertainment conglomerate launches a competing video streaming platform later this year, according to Wall Street analyst David Trainer.

″[Disney’s] got the ability to merchandise, which is another way to monetize content in a way that Netflix does not have,” Trainer said on CNBC’s “Closing Bell” Wednesday. He’s chief of the New Constructs research firm.

Netflix increased its subscription prices Tuesday, sending the stock up #.# percent that day as well.

However, Trainer called it a “key dilemma” for the company and it “makes their competitors more viable.” The dilemma, he explained, is that the streaming company relies too much on its subscribers to generate revenue.

“It’s a Catch-## for a business model that, when you look at the fundamentals, really just doesn’t work,” Trainer alleged.

The Netflix price increase for U.S. subscribers ranges between ## percent and ## percent, which Victory Anthony of Aegis Capital se
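The substitution above replaces each digit individually, so `6.5` becomes `#.#`. If the goal is to mask each number as a whole, one option is to match a full run of digits with an optional decimal part. This is a sketch on a made-up sample sentence, not the article text.

```python
import re

# Hypothetical sample sentence used only for illustration.
sample = "The stock rose 6.5 percent to $351, up from 13 and 18 percent."

# One '#' per digit, as in the cell above.
per_digit = re.sub(r'[0-9]', '#', sample)

# One '#' per number: a run of digits with an optional decimal part.
per_number = re.sub(r'\d+(?:\.\d+)?', '#', sample)

print(per_digit)
print(per_number)
```

Note that for `str` patterns in Python 3, `\d` also matches non-ASCII Unicode digits, while `[0-9]` matches ASCII digits only.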

## Print counts of “Netflix” and “Disney” mentions

In [21]:
words = ['Netflix', 'Disney']
total_count = 0

for word in words:
    # re.escape guards against regex metacharacters in the search term
    count = len(re.findall(re.escape(word), text))
    print('count of ' + word + ' is ' + str(count))
    total_count += count
print('total word count is ' + str(total_count))

count of Netflix is 9
count of Disney is 5
total word count is 14
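The loop above counts substring matches, so a search term would also match inside a longer word. A minimal sketch of a whole-word variant follows, using `\b` word boundaries and a single pass with `collections.Counter`. The sample sentence is invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sample sentence, not the article text.
sample = ("Netflix and Disney compete. Disney's Disney+ launches soon, "
          "but Netflixing is not a word.")

names = ['Netflix', 'Disney']

# \b anchors each alternative at word boundaries, so 'Netflixing' is skipped
# while "Disney's" and 'Disney+' still count.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, names)) + r')\b')

counts = Counter(pattern.findall(sample))
for name in names:
    print('count of ' + name + ' is ' + str(counts[name]))
print('total word count is ' + str(sum(counts.values())))
```

Compiling one alternation scans the text once instead of once per name, which matters only for long texts or long name lists.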


## Tokenize sentences and words, print trigrams in the first 3 sentences

In [32]:
sentences = sent_tokenize(text)
# Count trigrams separately for each of the first three sentences
for sentence in sentences[:3]:
    tokens = word_tokenize(sentence)
    print(Counter(ngrams(tokens, 3)))

Counter({('Disney', 'will', 'have'): 1, ('will', 'have', 'a'): 1, ('have', 'a', 'competitive'): 1, ('a', 'competitive', 'advantage'): 1, ('competitive', 'advantage', 'over'): 1, ('advantage', 'over', 'Netflix'): 1, ('over', 'Netflix', 'when'): 1, ('Netflix', 'when', 'the'): 1, ('when', 'the', 'entertainment'): 1, ('the', 'entertainment', 'conglomerate'): 1, ('entertainment', 'conglomerate', 'launches'): 1, ('conglomerate', 'launches', 'a'): 1, ('launches', 'a', 'competing'): 1, ('a', 'competing', 'video'): 1, ('competing', 'video', 'streaming'): 1, ('video', 'streaming', 'platform'): 1, ('streaming', 'platform', 'later'): 1, ('platform', 'later', 'this'): 1, ('later', 'this', 'year'): 1, ('this', 'year', ','): 1, ('year', ',', 'according'): 1, (',', 'according', 'to'): 1, ('according', 'to', 'Wall'): 1, ('to', 'Wall', 'Street'): 1, ('Wall', 'Street', 'analyst'): 1, ('Street', 'analyst', 'David'): 1, ('analyst', 'David', 'Trainer'): 1, ('David', 'Trainer', '.'): 1})
Counter({('″', '[', 
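Because the cell above builds a separate `Counter` per sentence, almost every trigram has a count of 1. A sketch of aggregating trigram counts across sentences follows. To stay self-contained it uses a small hypothetical helper, `make_ngrams`, and whitespace splitting as a stand-in for `word_tokenize`; the two sample sentences are made up.

```python
from collections import Counter

def make_ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    # zip stops at the shortest slice, so only complete n-grams are produced.
    return list(zip(*(tokens[i:] for i in range(n))))

# Hypothetical pre-tokenized sentences (whitespace split, not word_tokenize).
sentences = [
    "Disney will have a competitive advantage".split(),
    "Netflix will have a key dilemma".split(),
]

# A single Counter accumulates trigrams from every sentence.
trigram_counts = Counter()
for tokens in sentences:
    trigram_counts.update(make_ngrams(tokens, 3))

print(trigram_counts.most_common(3))
```

Building n-grams per sentence, rather than over the whole text at once, avoids trigrams that span a sentence boundary.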

## Using spaCy to return a list of extracted entities

In [4]:
import spacy
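# The 'en' shortcut works in spaCy 2.x only; spaCy 3+ needs the full package
# name of an installed model, e.g. spacy.load('en_core_web_sm')
nlp = spacy.load('en')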

In [20]:
text = '''Tesla Model Y’s wiring architecture and body casting, especially at the rear, is not disappointing in the first teardown of the new electric SUV.
As we reported yesterday,  Sandy Munro, a manufacturing expert who rose to fame in the Tesla community after his breakdown of an early Model 3, is doing the first teardown of a Model Y.

Munro is doing relating details about the teardown piece by piece and you can check our previous report for Tesla Model Y’s fit and finish, frunk, and suspension.

Now the manufacturing expert is finally getting into the nitty-gritty and removing some panels to reveal some of the more hardcore manufacturing and design of the Model Y.

In a new video, Munro gives his first impression on the Model Y’s wiring, brakes, quick-connects, and rear body structure:



These elements are of high interest to the Tesla community and analysts since the automaker and CEO Elon Musk have been boasting about improvements on those fronts.

As we previously reported, Tesla has been working on a revolutionary new wiring architecture to help robots build upcoming cars like the Model Y.

Musk also said that Tesla is moving to an aluminum casting design instead of a series of stamped steel and aluminum pieces for the Model Y body:

“When we get the big casting machine, it’ll go from 70 parts to 1 with a significant reduction in capital expenditure on all the robots to put those parts together.”

A new patent application filed last year revealed this new casting machine that Tesla is using to build Model Y.

Now Munro has started taking a look at the result of those new technologies in the Model Y and he seems impressed.

First, he looked at the 12-volt wiring and he was impressed by the rugged wrapping that Tesla used on the wires – something he says is “never done”:


Munro was also impressed by Tesla’s use of a more expensive quick connect system for the AC, which he believes will greatly reduce the chance of leaks down the road:


The manufacturing expert also gave a quick look at the Model Y’s brakes in the video and he says that they are bigger and stronger than Model 3’s brakes:


However, it looks like they are still repackaged Brembo brakes.

Munro says that he sees a real improvement to the backend of the car compared to the Model 3:

When Tesla did the Model 3, I was really unhappy with the backend of the car. I saw hundreds of parts that shouldn’t have been there.

For the Model Y, Munro is happy that Tesla changed the design with a whole boot made in one piece, which he believes is made of glass-filled nylon.


Around it, Munro noted a “gigantic aluminum casting” that makes up the back of the Model Y’s body, which was one of the biggest improvements from Model 3 to Model Y as we previously mentioned.

The manufacturing expert believes that Tesla used some of its criticism of the Model 3 to do those improvements to the Model Y.

Musk acknowledged that he appreciates Munro’s feedback:

“High quality critical feedback from Munro & Co is much appreciated!”

Overall, it appears the manufacturing expert is really happy with the Model Y, but there’s still a lot more to look into. He expects to release more videos in the coming days.
'''

In [21]:
doc = nlp(text.strip())

In [22]:
import pandas as pd

ent_text = []
ent_label = []
for ent in doc.ents:
    ent_text.append(ent.text)
    ent_label.append(ent.label_)

In [55]:
# Build the table directly from the two parallel lists
df = pd.DataFrame({'text': ent_text, 'label': ent_label})
df

|   | text | label |
|---|------|-------|
| 0 | \n | GPE |
| 1 | yesterday | DATE |
| 2 | Sandy Munro | PERSON |
| 3 | Tesla | PERSON |
| 4 | early Model 3 | PRODUCT |
| 5 | first | ORDINAL |
| 6 | Model Y.\n\nMunro | PERSON |
| 7 | Tesla Model | PRODUCT |
| 8 | the Model Y.\n\n | PRODUCT |
| 9 | Munro | ORG |
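The output contains entities that are pure whitespace (`\n` tagged as GPE) or that span a paragraph break. One possible cleanup, sketched here on the ten rows shown above rather than on a live `doc`, collapses whitespace inside each entity, drops the ones that end up empty, and counts the remaining labels. The names `rows` and `cleaned` are illustrative.

```python
from collections import Counter

# (text, label) pairs copied from the first ten rows of the output above.
rows = [
    ("\n", "GPE"),
    ("yesterday", "DATE"),
    ("Sandy Munro", "PERSON"),
    ("Tesla", "PERSON"),
    ("early Model 3", "PRODUCT"),
    ("first", "ORDINAL"),
    ("Model Y.\n\nMunro", "PERSON"),
    ("Tesla Model", "PRODUCT"),
    ("the Model Y.\n\n", "PRODUCT"),
    ("Munro", "ORG"),
]

cleaned = []
for ent_text, ent_label in rows:
    # split() with no arguments splits on any whitespace run, so rejoining
    # collapses newlines and trims both ends.
    normalized = " ".join(ent_text.split())
    if normalized:  # skip entities that were whitespace only
        cleaned.append((normalized, ent_label))

label_counts = Counter(label for _, label in cleaned)
print(cleaned)
print(label_counts)
```

This only tidies the strings. Mislabelled entities, such as `Tesla` tagged PERSON, come from the model itself, and stripping blank lines from the text before calling `nlp` may reduce the entities that cross paragraph breaks.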
