<H1>Queries & List of Query links</H1>

In [5]:
# Importing Libraries
from duckduckgo_search import DDGS
import requests
import pandas as pd
from bs4 import BeautifulSoup

In [6]:
# Queries (the last one is an extra query that searches by Canoo's NASDAQ ticker symbol, GOEV)
queries = [
    "Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players",
    "Analyze Canoo's main competitors, including their market share, products or services offered, pricing strategies, and marketing efforts",
    "Identify key trends in the market, including changes in consumer behavior, technological advancements, and shifts in the competitive landscape",
    # The trailing comma below matters: without it, Python joins this string with the next one into a single query
    "Gather information on Canoo's financial performance, including its revenue, profit margins, return on investment, and expense structure.",
    "Identify a publicly traded company listed on NASDAQ with ticker symbol GOEV"
]

# Initializing DDGS
ddgs = DDGS()

# Dictionary to store search results
search_results = {}

# Loop through each query
for query in queries:
    search_results[query] = []
    # Fetch up to 10 web links for this query
    results = ddgs.text(query, max_results=10)
    for result in results:
        # Skip any link that contains 'owler', as per the requirement
        if 'owler' not in result['href']:
            search_results[query].append(result['href'])

# Print the search results
for query, links in search_results.items():
    print(f"Results for: {query}")
    for link in links:
        print(link)
    print()


Results for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
https://incfact.com/company/canoo-torrance-ca/
https://github.com/theSuriya/Canoo-INC-Analysis
https://en.wikipedia.org/wiki/Canoo
https://fortune.com/2022/08/29/canoo-ev-startup-scored-deal-worlds-largest-retailer/
https://investors.canoo.com/news-presentations/press-releases/detail/52/canoo-reports-fourth-quarter-and-full-year-2020-results
https://www.industryweek.com/leadership/companies-executives/article/21263239/canoo-sticks-to-2023-production-target
https://investors.canoo.com/news-presentations/press-releases/detail/75/canoo-increases-production-guidance-and-targets-for-us
https://www.reuters.com/business/autos-transportation/canoo-advances-manufacturing-dates-electric-vehicles-2021-11-15/
https://www.reuters.com/business/autos-transportation/canoo-production-starts-could-slip-ceo-remains-confident-funding-2022-05-18/
https://www.motortrend.com/news/canoo-techno
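
The `'owler' not in result['href']` check in the cell above is a substring match, so it would also drop any URL that merely contains those letters (a path containing "growler", for example). Below is a minimal sketch of a stricter filter, assuming the requirement is to exclude the `owler.com` site. That domain and the helper names `is_blocked` and `filter_links` are illustrative assumptions, not part of the original notebook. The sketch compares the parsed hostname and also removes duplicate links.

```python
from urllib.parse import urlparse


def is_blocked(url, blocked_domains=("owler.com",)):
    """Return True when the URL's host is a blocked domain or one of its subdomains.

    The default domain "owler.com" is an assumption about what the
    'owler' requirement refers to.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def filter_links(urls, blocked_domains=("owler.com",)):
    """Drop blocked hosts and duplicate URLs while keeping the original order."""
    seen = set()
    kept = []
    for url in urls:
        if is_blocked(url, blocked_domains) or url in seen:
            continue
        seen.add(url)
        kept.append(url)
    return kept
```

With these helpers, the inner loop could be reduced to `search_results[query] = filter_links(r['href'] for r in results)`.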

<H1>Scrape the Data from Web Links & Store It in a Database (.csv)</H1>

In [7]:
# Initialize lists to store scraped data
queries = []
query_links = []
information = []

# Function to scrape website content
def scrape_website_content(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }

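    try:
        # A timeout keeps one unresponsive site from stalling the whole loop
        website = requests.get(url, headers=headers, timeout=10)
    except requests.RequestException:
        # Connection errors and timeouts are treated like a failed request
        return None
    if website.status_code == 200:
        soup = BeautifulSoup(website.content, 'html.parser')
        # Join the text of every <p> tag and flatten newlines, tabs, and carriage returns to spaces
        return ' '.join([i.text for i in soup.find_all('p')]).replace('\n', ' ').replace('\t', ' ').replace('\r', ' ')
    else:
        return None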

# Loop through search results
for query, urls in search_results.items():
    for url in urls:
        print('Searching for:', query)
        print('URL:', url)

        # Scraping website content
        content = scrape_website_content(url)
        if content is not None:
            queries.append(query)
            query_links.append(url)
            information.append(content)
        else:
            print('Error: Failed to retrieve content from the website')

# Create a DataFrame holding all the scraped results
scraped_data = pd.DataFrame({'query': queries, 'query_link': query_links, 'information': information})

# Save the information to a .csv file
scraped_data.to_csv('Scrape data from web links.csv', index=False)


Searching for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
URL: https://incfact.com/company/canoo-torrance-ca/
Searching for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
URL: https://github.com/theSuriya/Canoo-INC-Analysis
Searching for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
URL: https://en.wikipedia.org/wiki/Canoo
Searching for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
URL: https://fortune.com/2022/08/29/canoo-ev-startup-scored-deal-worlds-largest-retailer/
Searching for: Identify the industry in which Canoo operates, along with its size, growth rate, trends, and key players
URL: https://investors.canoo.com/news-presentations/press-releases/detail/52/canoo-reports-fourth-quarter-and-full-year-2020-results
Searching for: Identify the ind