**This project scrapes a large amount of news from Google News and Yahoo Finance, summarises each article with a pretrained deep learning model, and then runs sentiment analysis on the summarised text.**

# 1.Install and Import Dependencies

**Transformers is an `NLP` library from Hugging Face that provides many pretrained models and utilities. We import `PegasusTokenizer` and `PegasusForConditionalGeneration` from it. Pegasus is a state-of-the-art pretrained model, in the same way that `VGG16/VGG19` are pretrained models for images, except that it is trained on text, so we can use it for `text-summarisation`. More importantly, the checkpoint used in this notebook is aimed at summarising financial text.**

In [2]:
import pandas as pd
from transformers import PegasusTokenizer,PegasusForConditionalGeneration
from bs4 import BeautifulSoup
import requests

# 2.Setup Summarization Model

**As mentioned earlier, we use a pretrained model for our `financial-text-summarisation`: the `financial-summarization-pegasus` checkpoint shown ↓. `PegasusTokenizer.from_pretrained()` downloads and loads its tokenizer, and `PegasusForConditionalGeneration.from_pretrained()` downloads and loads the model itself.**

**In short, the tokenizer encodes text into token ids and decodes ids back into text, while the model generates the summary of the financial text.**

In [3]:
model_name = 'human-centered-summarization/financial-summarization-pegasus'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)


# 3.Summarize a Single Article

**First we send a request for the specified url with the `requests` library. `r.text` holds the response as text, which for a webpage is its `HTML` markup together with any inline `CSS` and `JS`. We then create a `BeautifulSoup` object from it using the `html.parser` parser and collect the text of every `<p>` tag on the page with `find_all('p')`, since the paragraph tags are what we are interested in, as shown ↓**

In [13]:
## sample run for extracting the data and summarizing a single article

url = 'https://au.news.yahoo.com/woolworths-iga-and-ikea-stores-among-latest-covid-exposure-sites-223840740.html'

r = requests.get(url)

soup = BeautifulSoup(r.text,'html.parser')

paragraphs = []

for paras in soup.find_all('p'):
    paragraphs.append(paras.text)

In [16]:
print(len(paragraphs))
print(paragraphs[:3])

40
["NSW exposure sites ballooned once again overnight as health authorities desperately try to get to grips with Sydney's daunting Covid-19 outbreak.", 'Of particular concern is the Raw Coffee Bar in Belmore in the inner-west where an infectious worker spent 10 days at the premises from July 7 to July 16.', 'Anyone who visited the coffee shop during the listed times must isolate for 14 days and seek testing immediately.\xa0']
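To see what collecting the `<p>` tags does without making a network request, here is a minimal sketch on an inline HTML string. It uses only the standard library's `html.parser` as a stand-in for `BeautifulSoup`; the `ParagraphCollector` class and `SAMPLE_HTML` are hypothetical names made up for this illustration.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of every <p> element, including nested inline tags."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._depth = 0      # greater than 0 while we are inside a <p>
        self._chunks = []    # text pieces of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.paragraphs.append("".join(self._chunks))
                self._chunks = []

    def handle_data(self, data):
        # keep text only while inside a paragraph
        if self._depth:
            self._chunks.append(data)


# hypothetical page used only for this illustration
SAMPLE_HTML = """
<html><body>
  <h1>Headline</h1>
  <p>First <a href="#">linked</a> paragraph.</p>
  <div>Not a paragraph</div>
  <p>Second paragraph.</p>
</body></html>
"""

collector = ParagraphCollector()
collector.feed(SAMPLE_HTML)
paragraphs = collector.paragraphs
print(paragraphs)
```

The headline and the `<div>` are skipped, while text inside the nested `<a>` tag is kept as part of its paragraph.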


In [20]:
## joining all the paragraphs and keeping only the first 400 words
## the summarisation model accepts a limited number of input tokens,
## so 400 words is an arbitrary cutoff chosen to stay on the safe side

words = ' '.join(paragraphs).split(' ')[:400]
article = ' '.join(words)
article

'NSW exposure sites ballooned once again overnight as health authorities desperately try to get to grips with Sydney\'s daunting Covid-19 outbreak. Of particular concern is the Raw Coffee Bar in Belmore in the inner-west where an infectious worker spent 10 days at the premises from July 7 to July 16. Anyone who visited the coffee shop during the listed times must isolate for 14 days and seek testing immediately.\xa0 Ten new bus route in the city\'s west are listed as close contact exposure routes. There is also a raft of new casual contact exposure sites, including Kmart at Glebe\'s Broadway. Anyone who visited between 3.40pm and 4.25pm on July 8 and 6.45pm and 7pm on the same day must isolate. IGA in Dulwich Hill, Woolworths in Merrylands and Ikea in Marsden Park are also included on the ever-growing list of exposure sites.  More than 100 cases were infectious in the community over the past four days, a statistic causing significant concern to health authorities as they try and dramat

In [21]:
print(len(words),words[:4])

400 ['NSW', 'exposure', 'sites', 'ballooned']
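The truncation step can be wrapped in a small helper. This is a sketch under one assumption that differs from the cell above: it calls `split()` with no argument, which also splits on the non-breaking spaces (`\xa0`) and repeated spaces visible in the scraped text. The name `first_n_words` is made up for this example.

```python
def first_n_words(text, n=400):
    """Return the first n whitespace-separated words of text as one string."""
    # split() with no argument splits on any run of whitespace,
    # so double spaces and '\xa0' do not produce empty or glued words
    words = text.split()
    return " ".join(words[:n])


sample = "NSW exposure  sites\xa0ballooned once again"
print(first_n_words(sample, 4))
```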


**The first step is to encode the text with the tokenizer we loaded earlier. We then feed the encoded input to the model and decode the result the model produces, as shown ↓**

In [22]:
input_ids = tokenizer.encode(article,return_tensors='pt')
output = model.generate(input_ids,max_length=55,num_beams=5,early_stopping=True)
summary = tokenizer.decode(output[0],skip_special_tokens=True)

**The first line is the encoding phase: it takes the article and converts it into a `PyTorch` tensor of token ids (that is what `return_tensors='pt'` asks for). Encoding is a basic step in any NLP pipeline. The tokenizer splits the text into subword tokens and maps each token to an integer id from its vocabulary, so the number of ids need not equal the 400 words. The first five ids produced by `tokenizer.encode()` are shown ↓**

In [31]:
input_ids[:1][0][:5]

tensor([10871,  3411,  1196, 12120,   316])
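Token ids do not map one-to-one onto words, which a toy example makes easier to see. The sketch below uses a hypothetical six-entry vocabulary and a greedy longest-match rule; it is not how the Pegasus tokenizer is implemented, only an illustration of text going to ids and back.

```python
# hypothetical toy vocabulary; the real tokenizer has a much larger learned one
TOY_VOCAB = {"<unk>": 0, "expos": 1, "ure": 2, "site": 3, "s": 4, " ": 5}


def toy_encode(text, vocab=TOY_VOCAB):
    """Greedily match the longest known piece at each position."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # no known piece starts here: emit <unk> and skip one character
            ids.append(vocab["<unk>"])
            i += 1
    return ids


def toy_decode(ids, vocab=TOY_VOCAB):
    """Map ids back to their pieces and join them."""
    inverse = {idx: piece for piece, idx in vocab.items()}
    return "".join(inverse[idx] for idx in ids)


toy_ids = toy_encode("exposure sites")
print(toy_ids, toy_decode(toy_ids))
```

Two words become five ids here, which is why the length of `input_ids` in the notebook can differ from the word count of the article.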

**Once we have the input ids, we feed them to the model, which generates the output ids of the `summarised` text. Passing those ids to `tokenizer.decode()` converts them back into text, giving us the summary of the 400-word article. `model.generate()` uses `beam search` here: with `num_beams=5` it keeps the five most probable partial sequences at each step instead of only the single best one. `early_stopping=True` ends the search as soon as `num_beams` complete candidates have been found (it is not related to `overfitting`), and `max_length=55` caps the length of the summary in tokens. The raw output ↓ is a tensor of ids, so we need to decode it to get the summary we are looking for!**

In [32]:
output

tensor([[    0, 13115,  6288,  2484,   115, 10941,  4378,   117,   790,   177,
          3411,  1196,   107,  1439,   197,  1061,   200, 10251,   115,  4445,
           204,   555,   541,   390,     1]])

In [23]:
summary

'Raw Coffee Bar in Belmore is among new exposure sites. More than 100 people infected in Sydney over past four days'
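Beam search itself can be sketched on a toy model. The example below is a simplified illustration, not the Hugging Face implementation: `TOY_MODEL` is a hypothetical table of next-token probabilities, there is no length normalisation, and the stopping rule only approximates what `early_stopping=True` requests.

```python
import math

# hypothetical next-token probabilities, keyed by the previous token
TOY_MODEL = {
    "<s>": {"bitcoin": 0.6, "stocks": 0.4},
    "bitcoin": {"falls": 0.5, "rises": 0.3, "</s>": 0.2},
    "stocks": {"rise": 0.9, "</s>": 0.1},
    "falls": {"</s>": 1.0},
    "rises": {"</s>": 1.0},
    "rise": {"</s>": 1.0},
}


def beam_search(model, num_beams=2, max_length=5):
    """Return the highest-scoring token sequence found with num_beams beams."""
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    finished = []
    for _ in range(max_length):
        # extend every live beam by every possible next token
        candidates = []
        for score, tokens in beams:
            for token, prob in model[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)

        # keep the best num_beams candidates, setting aside completed ones
        beams = []
        for score, tokens in candidates[:num_beams]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))

        # stop once enough complete candidates exist or nothing is left to extend
        if len(finished) >= num_beams or not beams:
            break
    return max(finished or beams, key=lambda c: c[0])[1]


print(beam_search(TOY_MODEL, num_beams=1))  # greedy decoding
print(beam_search(TOY_MODEL, num_beams=2))
```

With one beam the search commits to `bitcoin` because it is the most likely first token, ending with an overall probability of 0.30. With two beams it also keeps `stocks` alive and finds `stocks rise`, whose overall probability of 0.36 is higher.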

# 4.Building a News and Sentiment Pipeline

In [33]:
## we will scrape stock-related news for GME (GameStop), TSLA (Tesla) and BTC (Bitcoin)
## the tickers mentioned in the list below are searched for
## on Google News and Yahoo Finance

monitored_tickers = ['GME','TSLA','BTC']

### 4.1.Search for Stock News using Google and Yahoo Finance

**Here we build the search url for each ticker, open it in a browser driven by `Selenium`, read the `href` of each result link on the page, and click through the pagination links to collect more results. `requests` with `BeautifulSoup` could parse the `<a>` tags of a single page, but Selenium is the better tool here because it can follow the pagination.**
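The search url can also be built with the standard library instead of string concatenation, which takes care of escaping the query. This is a small sketch; `build_news_search_url` is a hypothetical helper name, and the `q` and `tbm=nws` parameters are the ones that appear in the notebook's own search url.

```python
from urllib.parse import urlencode


def build_news_search_url(ticker):
    """Build the Google News search url for 'yahoo finance <ticker>'."""
    # urlencode escapes the values and turns spaces into '+'
    query = urlencode({"q": f"yahoo finance {ticker}", "tbm": "nws"})
    return f"https://www.google.com/search?{query}"


for ticker in ["GME", "TSLA", "BTC"]:
    print(build_news_search_url(ticker))
```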

In [65]:
import selenium
from selenium import webdriver
import time
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from collections import defaultdict
import pandas as pd
import numpy as np

In [42]:
options = Options()
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")

options.add_experimental_option("prefs", {
"profile.default_content_setting_values.media_stream_mic": 1,
"profile.default_content_setting_values.media_stream_camera": 1,
"profile.default_content_setting_values.geolocation": 1,
"profile.default_content_setting_values.notifications": 1
})

# options.add_argument("--incognito")

In [52]:
'''
Scrape the result hrefs from the Google News search pages for a ticker,
following the pagination links to collect the urls.
Later we loop through each of those urls, scrape the text of the
<p> tags on each page, and summarise that text!

'''

def searchforstocknews(ticker,findtaglist,paginationlist):
    
    ## executable_path and find_element_by_xpath belong to the Selenium 3 API;
    ## newer Selenium releases use Service(...) and find_element(By.XPATH, ...)
    driver = webdriver.Chrome(options=options,
                              executable_path=ChromeDriverManager().install())
    
    searchurl = 'https://www.google.com/search?q=yahoo+finance+'+str(ticker)+'&tbm=nws'
    
    driver.get(searchurl)
    hrefs = []
    
    try:
        for clicker in paginationlist:

            try:

                for tag in findtaglist:
                    hrefs.append(driver.find_element_by_xpath(tag).get_attribute('href'))

            except Exception:
                ## no more results on this page: add a ' ' placeholder
                hrefs.append(' ')

            ## move on to the next page of results
            driver.find_element_by_xpath(clicker).click()
            
    except Exception:
        ## pagination link not found: we have reached the last page
        pass

    finally:
        ## always close the browser so that Chrome windows do not pile up
        driver.quit()

    
    return hrefs

paginationlist = ['//*[@id="xjs"]/table/tbody/tr/td['+str(i)+']/a' for i in range(3,21)]
findtaglist = ['//*[@id="rso"]/div['+str(i)+']/g-card/div/div/div[2]/a' for i in range(1,101)]

'''
Running a loop over all the monitored stocks and scraping the links from the 
search result pages using the searchforstocknews function as shown ↓

'''
finalmonitoredlinks = {}
for stock in monitored_tickers:
    finalmonitoredlinks[stock] = searchforstocknews(stock,findtaglist,paginationlist)



Current google-chrome version is 91.0.4472
Get LATEST driver version for 91.0.4472
Driver [C:\Users\meet\.wdm\drivers\chromedriver\win32\91.0.4472.101\chromedriver.exe] found in cache




In [58]:
for key,links in finalmonitoredlinks.items():
    print(f' Stock {key} has {len(links)} links of data')
    print('-'*50)

 Stock GME has 165 links of data
--------------------------------------------------
 Stock TSLA has 176 links of data
--------------------------------------------------
 Stock BTC has 165 links of data
--------------------------------------------------


##### We will trim the TSLA list to 165 links so that every stock has the same number of links



In [59]:
finalmonitoredlinks['TSLA'] = finalmonitoredlinks['TSLA'][:165]
for key,links in finalmonitoredlinks.items():
    print(f' Stock {key} has {len(links)} links of data')
    print('-'*50)

 Stock GME has 165 links of data
--------------------------------------------------
 Stock TSLA has 165 links of data
--------------------------------------------------
 Stock BTC has 165 links of data
--------------------------------------------------
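The balancing step can be generalised so that the target length is not hard-coded. A minimal sketch, assuming the same dictionary shape as `finalmonitoredlinks`; the function name `balance_links` and the sample data are hypothetical.

```python
def balance_links(links_by_ticker):
    """Trim every list of links to the length of the shortest one."""
    shortest = min(len(links) for links in links_by_ticker.values())
    return {ticker: links[:shortest] for ticker, links in links_by_ticker.items()}


# hypothetical data with the same shape as finalmonitoredlinks
sample_links = {
    "GME": ["g1", "g2"],
    "TSLA": ["t1", "t2", "t3"],
    "BTC": ["b1", "b2"],
}
balanced = balance_links(sample_links)
print({ticker: len(links) for ticker, links in balanced.items()})
```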


In [61]:
finalmonitoredlinks

{'GME': ['https://finance.yahoo.com/news/beyond-gamestop-reddit-wallstreetbets-now-142205449.html',
  'https://finance.yahoo.com/news/gamestop-investors-speculate-meme-stock-203707260.html',
  'https://finance.yahoo.com/news/gamestop-gme-down-13-2-153003874.html',
  'https://finance.yahoo.com/news/you-could-lose-everything-on-meme-stocks-franklin-templeton-ceo-135011534.html',
  'https://finance.yahoo.com/news/i-will-never-cover-game-stop-stock-ever-again-top-analyst-173202488.html',
  'https://finance.yahoo.com/news/why-the-meme-stock-revolution-will-last-093624458.html',
  'https://uk.finance.yahoo.com/news/happening-gamestop-gme-share-price-103815992.html',
  'https://finance.yahoo.com/news/10-reddit-wallstreetbets-meme-stocks-210143666.html',
  'https://finance.yahoo.com/news/meme-stocks-show-that-community-is-profitable-reddit-co-founder-173909883.html',
  'https://finance.yahoo.com/news/forget-amc-gamestop-10-stocks-205845496.html',
  ' ',
  'https://finance.yahoo.com/news/meme-s

### 4.2 Now we need to loop through each url and get the 'p' tags

In [66]:
%%time
'''
Scrape all the paragraph text from a webpage and return it as the article.
The try-except returns an empty string when a page cannot be fetched or parsed.

'''
def gettext(url):
    
    ## default value, so that the return below never fails
    article = ''
    
    try:
        textdata = []
        response = requests.get(url)
        soup = BeautifulSoup(response.text,'html.parser')
        for para in soup.find_all('p'):
            textdata.append(para.text)

        ## joining all the paragraphs and keeping only the first 400 words
        ## the summarisation model accepts a limited number of input tokens,
        ## so 400 words is an arbitrary cutoff chosen to stay on the safe side

        words = ' '.join(textdata).split(' ')[:400]
        article = ' '.join(words)
    
    except Exception:
        ## leave article as '' for this url
        pass
    
    return article


overalldatamap = defaultdict(list)
for stock,links in finalmonitoredlinks.items():
    for link in links:
        ## skip the ' ' placeholders added while scraping the links
        if link!=' ':
            overalldatamap[stock].append(gettext(link))
    

Wall time: 9min 1s


In [68]:
overalldatamap = dict(overalldatamap)
overalldatamap

{'GME': [" In this article, we discuss the 10 stocks Reddit's WallStreetBets is now targeting. If you want to skip our detailed analysis of these stocks, go directly to Beyond GameStop: Reddit's WallStreetBets is Now Targeting These 5 Stocks. Retail investors have been exercising ever greater influence on the overall market dynamics the past few months, as the record stock rallies of GameStop Corp. (NYSE: GME) and AMC Entertainment Holdings, Inc. (NYSE: AMC) have demonstrated. However, as their influence increases, so does the list of stocks that gain from this influence. The most popular platform where these retail investors exchange ideas is the Reddit forum WallStreetBets. The forum has more than 10.6 million members and is one of the hottest online places in finance.  Some of the stocks presently popular on WallStreetBets include Alibaba Group Holding Limited (NYSE: BABA), Virgin Galactic Holdings, Inc. (NYSE: SPCE), and MicroVision, Inc. (NASDAQ: MVIS). Interestingly, the theory t

In [69]:
for stock,data in overalldatamap.items():
    print(f' Stock {stock} has {len(data)} scraped')
    print('-'*50)

 Stock GME has 150 scraped
--------------------------------------------------
 Stock TSLA has 150 scraped
--------------------------------------------------
 Stock BTC has 150 scraped
--------------------------------------------------


### 4.3 Organising the Data into DataFrame

In [76]:
## the scraped text may contain duplicates, so we handle that here:
## set() removes the duplicate articles for each stock
## once the data is clean we organise it into a pandas dataframe
## and then apply the summarisation logic to every text
## we have scraped so far, as shown ↓ in the upcoming cells



gme = list(set(overalldatamap['GME']))
tsla = list(set(overalldatamap['TSLA']))
btc = list(set(overalldatamap['BTC']))
df = pd.DataFrame()
df['StockText'] = np.array(gme+tsla+btc)
df['StockOf'] = np.array(['GME']*len(gme)+['TSLA']*len(tsla)+['BTC']*len(btc))
df = df[['StockOf','StockText']]
df

Unnamed: 0,StockOf,StockText
0,GME,Investors in GameStop Corp. GME need to pay cl...
1,GME,The WallStreetBets bunch has set their sights ...
2,GME,"Reddit investors pack a powerful punch, just a..."
3,GME,In what has simultaneously been one of Wall St...
4,GME,"NEW YORK, NY / ACCESSWIRE / July 16, 2021 / L..."
...,...,...
254,BTC,China is cracking down on Bitcoin. Blockchain....
255,BTC,In 2016 a month’s bitcoin mining with a home c...
256,BTC,Bitcoin (BTC) is trading inside a range betwee...
257,BTC,Bitcoin (BTC) broke down on June 22. falling t...


In [77]:
df.shape

(259, 2)
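One side effect of `list(set(...))` is that the original order of the articles is lost. If the order matters, a possible alternative is `dict.fromkeys`, which keeps the first occurrence of each text in place. The sketch below is a suggestion, not what the notebook does; `dedupe_keep_order` is a hypothetical name, and dropping blank texts is an extra assumption.

```python
def dedupe_keep_order(texts):
    """Remove duplicate and blank texts while keeping first-seen order."""
    # dict keys are unique and remember insertion order
    unique = dict.fromkeys(texts)
    return [text for text in unique if text.strip()]


sample_texts = ["article B", "article A", "", "article B", "article C", "article A"]
print(dedupe_keep_order(sample_texts))
```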

### 4.4.Summarise all Articles

**How this function works was illustrated in the `sample run` in section 3, which can be used for reference. Since we now have a lot of text, we loop through each article and feed it to the model so that it returns the summary of that article. To keep this general, the logic is wrapped in a function as shown ↓. Once we have the summaries, we check for empty text or `NaN` values and ignore those rows, because the dataset is quite small. With large datasets we cannot simply ignore them and have to find a way to handle them, since they may play an important role. Below is the proposed model architecture that is followed throughout the process.**

![13.PNG](attachment:13.PNG)

In [82]:
%%time

## this function returns the summary of a long article!
## in a few cases generate() raises an index error, most likely
## because the tokenized article is longer than the model accepts
## the try-except returns None for those articles instead of stopping the run
## instead of writing an explicit loop we simply use pandas .apply()

def generatesummary(article):
    
    try:
        input_ids = tokenizer.encode(article,return_tensors='pt')
        output = model.generate(input_ids,max_length=55,num_beams=5,early_stopping=True)
        summary = tokenizer.decode(output[0],skip_special_tokens=True)
        return summary
        
    except Exception:
        ## no summary for this article
        return None
    
df['Summary'] = df['StockText'].apply(generatesummary)

Wall time: 45min 40s


In [83]:
df

Unnamed: 0,StockOf,StockText,Summary
0,GME,Investors in GameStop Corp. GME need to pay cl...,"Implied volatility is high for the Jul 16, 202..."
1,GME,The WallStreetBets bunch has set their sights ...,Retail investors are piling into meme stocks. ...
2,GME,"Reddit investors pack a powerful punch, just a...",London-based White Square Capital to close dow...
3,GME,In what has simultaneously been one of Wall St...,Geo Group may be Wall Street’s next big meme s...
4,GME,"NEW YORK, NY / ACCESSWIRE / July 16, 2021 / L...",
...,...,...,...
254,BTC,China is cracking down on Bitcoin. Blockchain....,"Peter Smith, Blockchain.com, says crackdown is..."
255,BTC,In 2016 a month’s bitcoin mining with a home c...,In 2016 a month’s bitcoin mining with a home s...
256,BTC,Bitcoin (BTC) is trading inside a range betwee...,
257,BTC,Bitcoin (BTC) broke down on June 22. falling t...,
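Several rows above have no summary. One way to detect them before the sentiment step is a small predicate that treats `None`, `NaN` and blank strings as missing. This is a sketch using only the standard library; `is_usable_summary` is a hypothetical helper, not part of the notebook.

```python
import math


def is_usable_summary(value):
    """Return True only for a non-empty, non-missing summary."""
    if value is None:
        return False
    # pandas stores missing values as float NaN
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(str(value).strip())


samples = ["Bitcoin tumbles to lowest level in nearly two weeks.", "", None, float("nan"), "   "]
print([is_usable_summary(s) for s in samples])
```

With the DataFrame from this notebook it could be applied as `df[df['Summary'].apply(is_usable_summary)]` to keep only the rows that have a summary.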


In [88]:
## looking for BTC stocks

df[df['StockOf']=='BTC'].style.background_gradient(cmap ='plasma')

Unnamed: 0,StockOf,StockText,Summary
151,BTC,"The crypto market was deeply in the red on Monday, extending a sell-off that began late last week and gathered pace over the weekend. Bitcoin (BTC-USD), the world's biggest cryptocurrency, was down 5.6% to $32,200 (£23,207) by 2.15pm in London. It marked the lowest level for bitcoin in nearly two weeks. Other major tokens were also in the red, with ethereum (ETH-USD) down 7.7% to $1,930 and XRP (XRP-USD) down 8.1% to $0.66. Dogecoin (DOGE-USD), the joke token propelled into the mainstream earlier this year, was down 18.1% to $0.21. By Monday morning, the entire crypto market had lost 6.6% of its value over the last 24 hours, according to data provider CoinMarketCap. The sell-off extends a slump that began late last week after the US Federal Reserve struck an unexpectedly hawkish tilt at its latest policy meeting. The central bank brought forward the timeline for future interest rate hikes, raising the prospect that cheap money may disappearing sooner than expected. The Fed's shift in stance sparked a move away from risk assets and cryptocurrencies have been caught up in the selling. CoinDesk reported that bitcoin fund holdings hit a four month low on Friday in the wake of the Fed meeting. ""Money originates at the Fed then flows through the markets,"" Mati Greenspan, the founder and chief executive of Quantum Economics, told Yahoo Finance UK. ""Now that they're talking about slowing down the pace of new money, or at least signalling that stimulus is not infinite, investors are starting to ask themselves what the true value of these assets are."" Headlines about an ongoing Chinese crackdown on cryptocurrency didn't help sentiment over the weekend. The Block, a cryptocurrency industry website, reported that regulators in Sichuan province had issued a statement ordering crypto miners in the region to shut down. 
Ruud Feltkamp, chief executive of crypto trading bot Cryptohopper, said: ""People still react strongly to actions from China that create uncertainty, so this is likely to reflect negatively on the Bitcoin price."" Feltkamp and other industry experts said the crackdown would likely have a short-term effect on mining capacity and prices but would likely just drive miners to relocate, rather than exit the industry altogether. ""Worldwide, miners continue to generate on average $30m daily, which shows the industry is still highly profitable,"" said Ulrik K. Lykke, executive director at digital asset fund ARK36. ""It can be reasonably expected that the","Bitcoin tumbles to lowest level in nearly two weeks. Ethereum, Dogecoin hit by regulators in China"
152,BTC,"After another mixed day for the crypto majors on Thursday, it’s been yet another bearish morning for Bitcoin and the broader crypto market. At the time of writing, Bitcoin, BTC to USD, was down by 1.74% to $32,971.0. A mixed start to the day saw Bitcoin rise to an early morning high $33,983.0 before hitting reverse. Falling well short of the first major resistance level at $34,842, Bitcoin slid to a mid-morning intraday low $32,700.0. Steering clear of the first major support level at $32,493, however, Bitcoin found support to revisit $33,000 levels before easing back. Key through the early hours was avoiding the first major support level and a return to sub-$32,000 levels. It’s been a bearish morning for the broader crypto market. At the time of writing, Chainlink was down by 4.51% to lead the way down. Bitcoin Cash SV (-3.40%), Binance Coin (-3.33%), Ethereum (-3.39%), and Litecoin (-3.77%) also struggled. Cardano’s ADA (-2.16%), Crypto.com Coin (-2.58%), Polkadot (-1.55%), and Ripple’s XRP (-2.72%) saw relatively modest losses through the morning, however. Through the early hours, the crypto total market rose to an early morning high $1,380bn before falling to a low $1,325bn. At the time of writing, the total market cap stood at $1,334bn. Bitcoin’s dominance fell to an early low 45.98% before rising to a high 46.44%. At the time of writing, Bitcoin’s dominance stood at 46.38%. Bitcoin would need to move back through the $33,781 pivot to bring the first major resistance level at $34,842 into play. Support from the broader market would be needed, however, for Bitcoin to break back through $34,000 levels. Barring a broad-based crypto rebound, the first major resistance level and Thursday’s high $35,069.0 would likely cap any upside. In the event of an extended crypto rally, Bitcoin could test resistance at $37,000 levels. The second major resistance level sits at $36,130. 
Failure to move back through the $33,781 pivot would bring the first major support level at $32,493 back into play. Barring an extended sell-off through the afternoon, however, Bitcoin should steer clear of sub-$30,000 support levels. The second major support level at $31,432 should limit the downside. Looking beyond the support and resistance levels, we saw a bearish cross this morning, with the 100 EMA crossing through the 200 EMA. This followed yesterday’s bearish cross, where the 50 crossed through the 100 EMA and 200 EMA. A further pullback by",
153,BTC,"Bitcoin (BTC) has been moving upwards since rebounding on June 22. It reached a local high of $34,881 the next day. Whether it finds support at the current price level or gets rejected will likely determine the direction of the future trend. Bitcoin has been moving upwards since June 22 after reaching a local low of $28,805. The bounce that followed was preceded by bullish divergences in both the RSI and the MACD signal line (blue lines). The next day, BTC reached a high of $34,881. However, it has been moving downwards since. While the bullish divergences have already played out, technical indicators are bearish. The MACD and RSI are both decreasing. The former is negative while the latter has fallen below 50. In addition to this, the Stochastic oscillator is very close to making a bearish cross (red circle). The closest support and resistance levels are at $31,300 and $41,500. The two-hour chart shows that BTC has already broken out from a descending resistance line, which had previously been in place since June 15. However, it was rejected by the 0.5 Fib retracement resistance level at $35,070. Currently, BTC is in the process of validating the line as support (green icon). The line also coincides with the 0.382 Fib retracement support level (white) Despite this, technical indicators do not show any bullish signs. The RSI has fallen below 50 and the MACD has given a bearish reversal signal. It’s not yet clear if the wave count is bullish or bearish. However, yesterday’s decrease below the $33,367 low (red line) suggests the latter. The increase from the lows looks to be a three-wave structure instead of a bullish impulse. For that to be true, the ongoing decrease would have to be a 1-2/1-2 wave structure. Therefore, a sharp decrease would follow. A move above the descending resistance line and the $33,869 high would suggest that this is not the case and the count is bullish instead. 
If BTC breaks out, it would likely mean that the current movement made up waves one and two (black) of a bullish impulse, that would complete a longer-term wave A (white). Therefore, the movement in the next few hours is crucial in determining the direction of the trend. For BeInCrypto’s latest bitcoin (BTC) analysis, click here. Related Quotes Supermajors are piling into the world’s hottest offshore drilling location after a string of major discoveries",Technical indicators do not show any bullish signs. Bitcoin has already broken out of a descending resistance line
154,BTC,"Check back at 8:30 a.m. ET for the results Bitcoin has experienced no shortage of news and announcements, as the seminal cryptocurrency attempts to mount a recovery following May’s rout and June’s outperformance over other major coins. The Fundamental Look Ahead of the 100th anniversary of China’s Communist Party, the government is continuing its crackdown on Bitcoin usage in the country. On Tuesday, July 6th, the People’s Bank of China warned local companies against providing any support or services to cryptocurrency-related services, this time focusing its attention on software provision. Some perceive the clampdown as an effort to prevent any embarrassing events from overshadowing the forthcoming anniversary and possibly fighting capital flight. Nonetheless, the clampdown might continue to pressure BTC prices over the near term. (See Bitcoin stock comparison on TipRanks) One of the most interesting outcomes from the exodus of miners from China was the major 28% downward mining difficulty adjustment on July 3rd. While the network hash rate fell to its lowest point since September of 2019, operational miners likely experienced a significant jump in profitability. That's because the adjustment made mining easier. In adoption news, payment services provider Allied Payment Network has embraced Bitcoin after partnering with New York Digital Investment Group. The provider will now offer Bitcoin payment services to its network of customers and clients, meaning that connected financial institutions can now offer their customers the ability to buy, hold, and sell Bitcoin. Moreover, Allied has announced its intention to add Bitcoin to its corporate treasury account. The other major event worth mentioning is the passing of Mircea Popescu, a famed Bitcoin agitator and maximalist known by the moniker “the Father of Bitcoin toxicity.” Popescu reportedly drowned off the coast of Costa Rica. 
The mysterious individual once threatened to dump a million Bitcoin on the market if proposed changes were made to the network, earning him the previously mentioned moniker. However, if true, there are concerns that his Bitcoin holdings may hit the market or be lost forever. Whales of the Week A whale is a term used in cryptocurrency to describe a person or entity that maintains a large amount of bitcoin. Whales' movements can possibly manipulate valuations of bitcoin. Here are some of last week's movements of large amounts of bitcoin: ● July 1: 6000.000 BTC moved from unknown wallet to Bitstamp ● July 2: 2217.600 BTC moved from multiple wallets to OKEx",clampdown on Bitcoin usage in China continues. Payment services provider Allied joins New York Digital
155,BTC,"Bitcoin (BTC) has been decreasing since being rejected at the $36,600 level on June 29. It’s potentially trading inside a short-term ascending wedge, which is normally considered a bearish pattern. BTC has been increasing since bouncing above the $31,400 horizontal support area on June 22. So far, it has managed to reach a high of $36,600, doing so on June 29. The upward movement was preceded by significant bullish divergence in the MACD, RSI, and Stochastic oscillator. However, neither has a bullish reading yet. The RSI is just below 50 and the MACD signal line is still negative. Furthermore, the Stochastic oscillator has already made a bearish cross. The BTC price fell yesterday but did not create a bearish engulfing candlestick. Nevertheless, it seems to have resumed its downward movement today. The main resistance area is at $40,550. This is a horizontal resistance and the 0.382 Fib retracement level. The two-hour chart shows that BTC is potentially trading inside an ascending wedge. The resistance line of the pattern has been touched multiple times while the support line is not yet confirmed since it has only been touched twice. There is strong support at $32,640 (the 0.618 Fib retracement support level) and the potential ascending support line of the wedge. The MACD and RSI are both decreasing, supporting the possibility that the price will fall towards this level. The longer-term count suggests that BTC is still in wave five of a bearish impulse. The aforementioned June 29 rejection occurred right at the 0.618 Fib retracement resistance level. it was likely the top of sub-wave two (black). While corrective structures are often contained inside parallel channels, the high failed to reach the resistance line of the channel (red arrow). In addition, BTC has now fallen below the midline of the channel, another bearish sign. 
A breakdown from the channel would likely confirm the downward trend and indicate that BTC will decrease towards $23,600 and potentially $19,800. For BeInCrypto’s previous bitcoin (BTC) analysis, click here. Related Quotes Chipmaker Micron Technology reported earnings after the bell. Yahoo Finance's Jared Blikre breaks it down. Shares of Chinese electric-vehicle maker NIO (NYSE: NIO) were moving higher in early trading on Wednesday, after a Wall Street analyst raised his bank's price target for the shares in a bullish note. As of 10:15 a.m. EDT, NIO's American depositary shares were up about 5.1% from Tuesday's closing price.",Bitcoin is potentially trading inside a short-term ascending wedge.
156,BTC,"Bitcoin prices are on track for a record second-quarter percentage decline, weighed down by China’s crackdown, concerns the U.S. Federal Reserve will start tapering its stimulus program and persistent demand for downside hedges. The leading cryptocurrency was trading near $34,824 at 10:39 UTC on Wednesday (6:39 a.m. Eastern time), down almost 41% for the April to June period. The drop snaps a four-quarter winning streak that saw prices chart a sixfold rise to almost $60,000, according to Bitstamp data. The historically strong quarter began on a positive note, with bitcoin rallying to a record $64,801 in the run-up to the Nasdaq debut of cryptocurrency exchange Coinbase on April 14. However, momentum stalled in subsequent weeks as retail investors struggled to do the heavy lifting in the wake of selling by large investors. Related: ‘They’d Rather Buy Bitcoin and Take That Risk’: An Interview With Paxful’s Nena Nwachukwu The market, therefore, looked weak and took a beating in mid-May after U.S. electric-car maker Tesla delisted bitcoin as a payments alternative, citing environmental concerns and dashing hopes for widespread corporate adoption. China’s reiteration of the crypto-mining ban and concerns of an early unwinding of stimulus by the Fed amplified the bearish move, pushing prices down to a then four-month low of $30,000. Since then, bitcoin has traded mainly in the range of $30,000 to $40,000, barring a brief drop to $28,600 on June 22. Sentiment has turned quite bearish, as evidenced by directionless trading in the wake of El Salvador’s decision to adopt the cryptocurrency as a legal tender. Moreover, fears of a deeper sell-off persist, according to the so-called options smile – a chart created by plotting implied volatilities against options at various strike prices expiring on the same date. Implied volatility refers to investors’ expectations of price turbulence over a specific period. It is influenced by demand for call and put options. 
The option smile for short-date and near-dated expiries carries a steep slope at strikes lower than bitcoin’s current price. That’s a sign of relatively higher implied volatility or demand for options at lower strikes than those at higher strikes. In other words, investors are continuing to buy protective puts – contracts that give purchaser the right but not the obligation to sell the underlying asset, in this case bitcoin, at a predetermined price on or before a specific date. Related: China’s Bitcoin Mining Crackdown Is a Boon for",
157,BTC,"Robert Kiyosaki has stated that buying bitcoin, gold, and silver are good bets as he expects the biggest crash in world history is upon us. The American businessman and author of the acclaimed book “Rich Dad, Poor Dad” has expressed his views on the current financial market. Kiyosaki is a bitcoin maximalist who has become very vocal about his opinions on bitcoin in recent months. The 74-year-old’s recent tweet backs gold, silver, and bitcoin as safe haven assets pending a worldwide crash in the financial market. In a tweet, Kiyosaki stated “The best time to prepare for a crash is before the crash. The biggest crash in world history is coming. The good news is the best time to get rich is during a crash. Bad news is the next crash will be a long one. Get more gold, silver, and Bitcoin while you can. Take care.” The famed author has expressed his concerns related to the current financial markets on several occasions, reiterating his beliefs that the markets could see a massive crash in the future. In a tweet from last week he once again backed bitcoin, saying “Biggest bubble in world history getting bigger. Biggest crash in world history coming. Buying more gold and silver. Waiting for Bitcoin to drop to $24 k. Crashes best time to get rich. Take care.” While Kiyosaki expects to buy bitcoin around $24,000, the price appears to have stabilized over $30,000. With the current price sitting at $34,200 on Monday. The bitcoin mining hashrate recently hit a 13-month low following the recent ban on bitcoin mining in China. However, the price of bitcoin appears to have stabilized for the time being. While Kiyosaki backs bitcoin as a store of value with the likes of gold and silver, countries and corporations continue to see bitcoin as a viable asset in terms of diversifying risk and as a store of value. Recently, Mexican billionaire Ricardo Salinas Pliego recommended that bitcoin be used in the country. 
Pliego stated that his bank is working to become the first in the country to accept bitcoin. Related Quotes SafeDollar, an algorithmic stablecoin on Polygon aiming to ""become the most safe stable coin for Defi, (decentralized finance)"" according to its website, saw its value crash to $0 following a... Coinbase was the most complained-about crypto digital wallet in the Consumer Finance Protection Bureau’s complaint database, with the volume of",American businessman and author of acclaimed book ‘Rich Dad’. Robert has expressed his concerns on the current financial market
158,BTC,"Cryptocurrencies suffered another brutal weekend with bitcoin (BTC-USD) approaching $30,000 and ethereum (ETH-USD) sliding under $2,000 after China toughened its stance on crypto payments. But Coinbase customers now have access to an ethereum competitor called Polkadot (DOT1-USD), which is gaining the attention of Wall Street. At a recent Yahoo Finance Plus webinar, Capital2Markets President Keith Bliss compared the two developer networks, which are designed to decentralized finance. ""Polkadot is an ethereum competitor and a lot of programmers are now using that blockchain to build applications off of, because they consider it a little bit safer. It appears to have plugged in some of the pinholes that you see in the ethereum blockchain that some programmers and engineers have cited as problems going forward,"" he said. Both platforms use smart contracts, but polkadot goes a step further, allowing developers to build their own blockchains that can connect to each other. Ethereum 2.0 is expected to add similar functionality when it goes live later this year or in 2021. Polkadot does face headwinds though, not the least of which are crashing crypto prices, which are souring institutional interest and attracting regulatory scrutiny. Last Wednesday, the U.S. Securities and Exchange Commission delayed one of the many pending bitcoin ETFs before the agency. There's also no guarantee that polkadot will be widely adopted. Bliss urges crypto investors to do their own research and make sure the underlying business or concept is bona fide. ""[G]iven all the coins that are out there, it's easy to do some research and find the ones that have real businesses underpinning them in their movement, like Polkadot ... [T]hen you could make the investment accordingly if you don't want to have the volatile ride across the entire complex."" Jared Blikre is an anchor and reporter focused on the markets on Yahoo Finance Live. 
Follow him @SPYJared Upgrade now to stay ahead of the market with Yahoo Finance Plus. Here's why bitcoin is so volatile: trader Stock market will likely be boring this summer: veteran trader The Ichimoku Cloud says remain bullish on Tesla stock: analyst Dogecoin is still in a steep uptrend but be careful: analyst Related Quotes Fundstrat Lead Digital Asset Strategist, David Grider, joins Yahoo Finance to discuss how the crypto space is experiencing an extended time of volatility as China’s crackdown and the Fed’s hawkish stance could affect the future of the crypto economy. It’s back into","Polkadot is an ethereum competitor, Capital2Markets president says."
159,BTC,"Michele Schneider joins Jared Blikre to talk momentum trades and advanced charting on Wed, 7/14 at 2PM EDT An analyst who predicted the bitcoin mid-May price slide says the cryptocurrency’s current range play is likely to be resolved on the higher side. “The consolidation phase itself is neutral, but we think a breakout is more likely than a breakdown,” Katie Stockton, founder and managing partner of Fairlead Strategies, said in a research note published on Monday. “Intermediate-term momentum has been improving based on the MACD histogram.” Bitcoin has been trading between $30,000 and $40,000 since late May. The price range has narrowed further in the past two weeks, with bulls unwilling to send prices above $36,000 and sellers refusing to step in below $32,000. Related: US Financial Giant Capital Group Buys 12% Stake in Bitcoin-Exposed MicroStrategy A big move looks overdue and could be bullish, as the weekly chart MACD histogram, an indicator used to gauge trend strength and trend changes, has turned higher, having bottomed out in mid-June. The consecutive shallow bars below the zero line indicate seller exhaustion. The relative strength index continues to indicate oversold conditions with a below-30 print. “Intermediate-term oversold conditions have generated stabilization above $30,000, which has proven to be strong support for bitcoin,” Stockton said. According to Stockton, the expected breakout would be confirmed on consecutive daily UTC closes above the 50-day simple moving average (SMA) at $35,500. That would the doors to the next resistance level, near $44,000. Related: Market Wrap: Bitcoin Drops as Traders Await June CPI Inflation Report The 50-day SMA is one of the most widely-tracked technical lines. Stockton mentioned it as the level to defend for the bulls back in April, when prices were trading well above average. The SMA support was breached on April 20 and was followed by a sell-off in May. 
At press time, bitcoin is trading little changed on the day near $33,200. A break below the long-held support at $30,000 could invite chart-driven sellers. However, Stockton sees a low probability of a range breakdown. Also read: Market Wrap: Bitcoin Drops as Traders Await June CPI Inflation Report Bitcoin Listless as New ‘Bearish Crossover’ Looms Synthetix Rallies as DeFi Protocol Announces Layer 2 Launch Shares in the company rose 6.6% on Tuesday in Helsinki following the announcement. Warren Buffett stocks are famous for tight focus. And this year, the famed investor's concentrated play","Intermediate-term momentum improving, analyst says. Bitcoin has been trading between $30,000 and $40,000 since late May"
160,BTC,"In this series of articles, BeinCrypto explores the state of various cryptocurrency ETFs in the United States. This particular article will focus on the Kryptoin bitcoin ETF, which was first introduced in 2019. Kryptoin’s bitcoin ETF has been in the news since 2019 and may find approval alongside the cryptocurrency market’s success in 2021. Institutional involvement in bitcoin and other cryptocurrencies has ramped up in recent years. The turning point for this change was introducing the world’s first bitcoin derivative products by Cboe and CME in late 2017. In the years between then and now, several other reputable names in finance have also joined the cryptocurrency bandwagon. These include Intercontinental Exchange, the company that operates the New York Stock Exchange (NYSE). Several crypto-oriented hedge funds have also emerged of late. The most notable being Grayscale Investments. A recent forecast estimated that hedge funds would increase its exposure to cryptocurrency to 7% within the next five years. However, the average individual isn’t likely to approach a hedge fund or similar high-profile investment offerings. Instead, they’re much more likely to invest in a retail or consumer-focused product, similar to how index funds work in the world of traditional finance. To that end, a handful of North American companies have been trying to launch their own Exchange Traded Funds (ETFs) that track bitcoin or ethereum’s price instead of a stock market index. The Kryptoin Bitcoin ETF is a financial product sponsored and launched by Kryptoin Investment Advisors LLC — a Delaware-based company. The firm is headed by Jason Toussaint, who was once the CEO of World Gold Trust Services. Most notably, the firm is the sponsor of the world’s largest Gold ETF, the SPDR Gold Shares (GLD). Toussaint has also taken up ETF and investment roles at Morgan Stanley and JP Morgan Asset Management in the past. 
If approved, Kryptoin’s ETF would be listed on the CBOE BZX exchange, according to the firm’s S-1 filing with the SEC on April 9. Notably, prior applications indicated that the firm was initially looking to be listed on the NYSE instead. According to a filing published by the United States Securities and Exchange Commission (SEC), the ETF will trade under the KBTC ticker symbol. In addition, the firm has stated that Gemini, the cryptocurrency exchange owned by the Winklevoss twins, will act as its custodian for any BTC acquired for the ETF. The Bank of Mellon","Kryptoin’s Bitcoin ETF has been in the news since 2019. If approved, Kryptoin’s ETF would be listed on the CBOE BZX"


# 5.Adding Sentiment Analysis

In [89]:
## separating the stocks

gme = df[df['StockOf']=='GME']
tesla = df[df['StockOf']=='TSLA']
btc = df[df['StockOf']=='BTC']

In [90]:
## downloading the sentiment analyser model from transformers

from transformers import pipeline
sentiment = pipeline('sentiment-analysis')
sentiment

Downloading:   0%|          | 0.00/629 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/268M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

<transformers.pipelines.text_classification.TextClassificationPipeline at 0x2843d888b08>
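
Calling `pipeline('sentiment-analysis')` without a model name lets the library pick a default checkpoint, and that default can differ between releases. The sketch below pins the checkpoint explicitly. The name used here is an assumption: it is the SST-2 DistilBERT checkpoint that the default is believed to resolve to, so check it against your installed version. `build_sentiment_pipeline` is a hypothetical helper, not part of the original notebook.

```python
# Assumed checkpoint name (believed to be the pipeline default); verify it
# against the transformers version you have installed.
SENTIMENT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"


def build_sentiment_pipeline(model_name=SENTIMENT_MODEL):
    """Return a sentiment-analysis pipeline bound to an explicit checkpoint."""
    # Imported lazily so that defining this helper triggers no download.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=model_name)
```

Calling `sentiment = build_sentiment_pipeline()` would then replace the cell above. SST-2 is built from movie-review sentences, so labels on financial summaries are a rough signal at best.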

In [108]:

%%time

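def getsentimentlabels(text):
    """Return the predicted label ('POSITIVE' or 'NEGATIVE') for `text`."""
    try:
        sents = sentiment(text)
        return sents[0]['label']
    except Exception:
        # for example a missing (NaN) summary: leave the label empty
        return None

def getsentimentscore(text):
    """Return the confidence score of the predicted label for `text`."""
    try:
        sents = sentiment(text)
        return sents[0]['score']
    except Exception:
        # for example a missing (NaN) summary: leave the score empty
        return None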


    

gme = df[df['StockOf']=='GME']
tesla = df[df['StockOf']=='TSLA']
btc = df[df['StockOf']=='BTC']

gme['Label'] = gme['Summary'].apply(getsentimentlabels)
gme['Score'] = gme['Summary'].apply(getsentimentscore)

tesla['Label'] = tesla['Summary'].apply(getsentimentlabels)
tesla['Score'] = tesla['Summary'].apply(getsentimentscore)

btc['Label'] = btc['Summary'].apply(getsentimentlabels)
btc['Score'] = btc['Summary'].apply(getsentimentscore)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Wall time: 12.6 s
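
The `SettingWithCopyWarning` messages appear because `gme`, `tesla` and `btc` are filtered slices of `df` that then get new columns assigned. The cell also runs the model twice per summary, once for the label and once for the score. A minimal sketch of one way around both follows. It assumes a hypothetical helper `add_sentiment` and uses a stand-in classifier, `fake_sentiment`, so that it runs without downloading a model; in the notebook you would pass the real `sentiment` pipeline instead.

```python
import pandas as pd


def add_sentiment(frame, classifier):
    """Return a copy of `frame` with 'Label' and 'Score' columns added.

    `classifier` is any callable mapping a string to a list such as
    [{'label': ..., 'score': ...}], which is what `sentiment` returns.
    """
    out = frame.copy()  # explicit copy, so the assignments below are unambiguous
    labels, scores = [], []
    for text in out["Summary"]:
        if isinstance(text, str) and text.strip():
            result = classifier(text)[0]  # one call yields both fields
            labels.append(result["label"])
            scores.append(result["score"])
        else:
            # Missing summaries stay unlabeled.
            labels.append(None)
            scores.append(float("nan"))
    out["Label"] = labels
    out["Score"] = scores
    return out


def fake_sentiment(text):
    """Stand-in for the real pipeline, for illustration only."""
    label = "POSITIVE" if "up" in text else "NEGATIVE"
    return [{"label": label, "score": 0.9}]


demo = pd.DataFrame(
    {
        "StockOf": ["GME", "BTC", "BTC"],
        "Summary": ["Shares up 5%", None, "Prices slide"],
    }
)
btc_demo = add_sentiment(demo[demo["StockOf"] == "BTC"], fake_sentiment)
print(btc_demo)
```

Because `StockOf` stays in the frame, the same helper could also be applied to the whole `df` at once, which would make the later split and `pd.concat` optional.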




In [109]:
gme

Unnamed: 0,StockOf,StockText,Summary,Label,Score
0,GME,Investors in GameStop Corp. GME need to pay cl...,"Implied volatility is high for the Jul 16, 202...",NEGATIVE,0.972861
1,GME,The WallStreetBets bunch has set their sights ...,Retail investors are piling into meme stocks. ...,NEGATIVE,0.97252
2,GME,"Reddit investors pack a powerful punch, just a...",London-based White Square Capital to close dow...,NEGATIVE,0.999123
3,GME,In what has simultaneously been one of Wall St...,Geo Group may be Wall Street’s next big meme s...,POSITIVE,0.99526
4,GME,"NEW YORK, NY / ACCESSWIRE / July 16, 2021 / L...",,,
5,GME,"In this article, we discuss the 10 stocks Red...",,,
6,GME,(Corrects paragraph 5 to say Ryan Cohen is cha...,"Shares in IPO priced at A$17 each, above range...",POSITIVE,0.966807
7,GME,"“Oh people, look around you. The signs are eve...","Fed policy, stimulus checks, wages boosting st...",NEGATIVE,0.993299
8,GME,Not Found Details: cache-del21733-DEL 16265891...,Varnish cache server not found.,NEGATIVE,0.999381
9,GME,"Stocks jumped on Monday, with the three major ...",The Dow Jones Industrial Average had its best ...,NEGATIVE,0.981046


In [110]:
finallist = [gme,tesla,btc]
finaldf = pd.concat(finallist,axis = 0)
finaldf

Unnamed: 0,StockOf,StockText,Summary,Label,Score
0,GME,Investors in GameStop Corp. GME need to pay cl...,"Implied volatility is high for the Jul 16, 202...",NEGATIVE,0.972861
1,GME,The WallStreetBets bunch has set their sights ...,Retail investors are piling into meme stocks. ...,NEGATIVE,0.972520
2,GME,"Reddit investors pack a powerful punch, just a...",London-based White Square Capital to close dow...,NEGATIVE,0.999123
3,GME,In what has simultaneously been one of Wall St...,Geo Group may be Wall Street’s next big meme s...,POSITIVE,0.995260
4,GME,"NEW YORK, NY / ACCESSWIRE / July 16, 2021 / L...",,,
...,...,...,...,...,...
254,BTC,China is cracking down on Bitcoin. Blockchain....,"Peter Smith, Blockchain.com, says crackdown is...",POSITIVE,0.995479
255,BTC,In 2016 a month’s bitcoin mining with a home c...,In 2016 a month’s bitcoin mining with a home s...,NEGATIVE,0.998745
256,BTC,Bitcoin (BTC) is trading inside a range betwee...,,,
257,BTC,Bitcoin (BTC) broke down on June 22. falling t...,,,


#### Final Touches

In [116]:
finaldf.isnull().sum()

StockOf       0
StockText     0
Summary      69
Label        69
Score        69
dtype: int64

In [119]:
## dropping the rows that contain NaN

finaldf = finaldf.dropna()
finaldf.isnull().sum()

StockOf      0
StockText    0
Summary      0
Label        0
Score        0
dtype: int64
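
`dropna()` removes a row if any column is missing. Here only `Summary`, `Label` and `Score` have gaps, so naming the column makes the intent explicit. Note also that GME row 8 above is a failed fetch ("Not Found Details: ...") that was still summarised and scored, so `dropna()` keeps it. The sketch below filters both cases on a toy frame; the "Not Found" prefix check is an assumption based on that single row.

```python
import pandas as pd

# Toy rows copied from the GME output above (texts are truncated there too).
demo = pd.DataFrame(
    {
        "StockOf": ["GME", "GME", "GME"],
        "StockText": [
            "Investors in GameStop Corp. GME need to pay cl...",
            "NEW YORK, NY / ACCESSWIRE / July 16, 2021 / L...",
            "Not Found Details: cache-del21733-DEL 16265891...",
        ],
        "Summary": [
            "Implied volatility is high for the Jul 16, 202...",
            None,
            "Varnish cache server not found.",
        ],
    }
)

# Drop rows where the summariser produced nothing.
cleaned = demo.dropna(subset=["Summary"])

# Assumption: a failed fetch starts with "Not Found", as in GME row 8.
is_error_page = cleaned["StockText"].str.startswith("Not Found", na=False)
cleaned = cleaned[~is_error_page].reset_index(drop=True)
print(cleaned)
```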

In [120]:
finaldf

Unnamed: 0,StockOf,StockText,Summary,Label,Score
0,GME,Investors in GameStop Corp. GME need to pay cl...,"Implied volatility is high for the Jul 16, 202...",NEGATIVE,0.972861
1,GME,The WallStreetBets bunch has set their sights ...,Retail investors are piling into meme stocks. ...,NEGATIVE,0.972520
2,GME,"Reddit investors pack a powerful punch, just a...",London-based White Square Capital to close dow...,NEGATIVE,0.999123
3,GME,In what has simultaneously been one of Wall St...,Geo Group may be Wall Street’s next big meme s...,POSITIVE,0.995260
6,GME,(Corrects paragraph 5 to say Ryan Cohen is cha...,"Shares in IPO priced at A$17 each, above range...",POSITIVE,0.966807
...,...,...,...,...,...
251,BTC,"BeinCrypto spoke to Alexander Höptner, CEO of ...",Hptner says he will use bitcoin in everyday li...,POSITIVE,0.982410
252,BTC,Bitcoin’s (BTC) price gave up some of its gain...,"Initial resistance seen around $36,000. Bitcoi...",NEGATIVE,0.999617
253,BTC,"As bitcoin nears support at $30,000, analysts ...","Delta Exchange sees a bounce to $40,000 in com...",NEGATIVE,0.998385
254,BTC,China is cracking down on Bitcoin. Blockchain....,"Peter Smith, Blockchain.com, says crackdown is...",POSITIVE,0.995479


## Where can we use this?


**Running the script gives a short summary of each current news article for every stock, together with a sentiment label (positive or negative) and a confidence score, as the dataframe above shows. This makes it quick to see which way today's news about a stock is leaning.**
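
One way to turn the labelled rows into a quick per-stock overview is to count labels by ticker. The sketch below uses a toy frame shaped like `finaldf`; the values are illustrative, not real output, and the `PositiveShare` column is a name introduced here for the example.

```python
import pandas as pd

# Toy rows shaped like `finaldf`; values are illustrative only.
demo = pd.DataFrame(
    {
        "StockOf": ["GME", "GME", "GME", "BTC", "BTC"],
        "Label": ["NEGATIVE", "NEGATIVE", "POSITIVE", "POSITIVE", "NEGATIVE"],
        "Score": [0.97, 0.99, 0.99, 0.98, 0.99],
    }
)

# Count labels per ticker: tickers become rows, labels become columns.
counts = demo.groupby(["StockOf", "Label"]).size().unstack(fill_value=0)

# Make sure both label columns exist even if one label never occurs.
counts = counts.reindex(columns=["NEGATIVE", "POSITIVE"], fill_value=0)

# Share of positive summaries per ticker.
counts["PositiveShare"] = counts["POSITIVE"] / counts.sum(axis=1)
print(counts)
```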

# 6.Exporting Results

In [121]:
finaldf.to_csv('finalsummaryreport.csv', index=False)
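
A quick round trip can confirm that the exported file reads back with the expected columns. The sketch below writes a toy frame to a temporary directory rather than the real report. If the index were written as well, `pd.read_csv` would read it back as an extra `Unnamed: 0` column.

```python
import os
import tempfile

import pandas as pd

# Toy row shaped like `finaldf`; values are illustrative only.
demo = pd.DataFrame(
    {
        "StockOf": ["GME"],
        "Summary": ["Retail investors are piling into meme stocks."],
        "Label": ["NEGATIVE"],
        "Score": [0.97],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "finalsummaryreport.csv")
    demo.to_csv(path, index=False)  # leave the row index out of the file
    restored = pd.read_csv(path)

print(list(restored.columns))
```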