# Tasks with solutions for the Startup Investments Database (Kaggle)

Link to the dataset: https://www.kaggle.com/justinas/startup-investments

1. Show all records from the company table for companies that are closed.

In [None]:
SELECT *
FROM company
WHERE company.status = 'closed'

2. Show the amount of funding raised by news companies based in the USA. Use data from the company table. Sort the table in descending order by the funding_total field.

In [None]:
SELECT c.funding_total
FROM company AS c
WHERE c.category_code = 'news'
    AND c.country_code = 'USA'
ORDER BY c.funding_total DESC

3. Find the total amount of deals where one company buys another company's shares for cash from 2011 to 2013 inclusive.

In [None]:

SELECT SUM(price_amount)
FROM acquisition AS a
WHERE a.term_code = 'cash'
    AND EXTRACT(YEAR FROM CAST(acquired_at AS timestamp)) BETWEEN 2011 AND 2013

4. Show the first name, last name, and network_username of people whose network_username starts with 'Silver'. 

In [None]:

SELECT first_name,
       last_name,
       network_username
FROM people
WHERE network_username LIKE 'Silver%'


5. Show all the information about people whose network_username contains the substring 'money' and whose last name starts with 'K'.

In [None]:
SELECT *
FROM people
WHERE network_username LIKE '%money%'
    AND last_name LIKE 'K%'

6. For every country display the total amount of investments that have been received by companies registered in that country. The country in which the company is registered can be determined by the country code. Sort the data by descending amount.

In [None]:

SELECT country_code,
       SUM(funding_total)
FROM company AS c
GROUP BY country_code
ORDER BY SUM(funding_total) DESC

7. Build a table with the date of each funding round and the minimum and maximum amounts raised on that date. Keep only the rows where the minimum amount is not zero and is not equal to the maximum.

In [None]:
SELECT funded_at,
       MIN(raised_amount),
       MAX(raised_amount)
FROM funding_round
GROUP BY funded_at
HAVING MIN(raised_amount) != 0
  AND MIN(raised_amount) != MAX(raised_amount)

8. Create a new field with categories:  
• For funds that have invested in 100 or more companies, assign the category "high_activity".  
• For funds that have invested in at least 20 but fewer than 100 companies, assign the category "middle_activity".  
• If the number of invested companies is less than 20, assign the category "low_activity".  
Display all table fields and the new category field.

In [None]:
SELECT *,
       CASE
           WHEN invested_companies >= 100 THEN 'high_activity'
           WHEN invested_companies >= 20 AND invested_companies < 100
           THEN 'middle_activity'
           WHEN invested_companies < 20 THEN 'low_activity'
       END AS category
FROM fund
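
The category boundaries are easy to get wrong by one, so here is a minimal sketch that runs the same CASE expression on toy values around the thresholds. It assumes an in-memory SQLite database created through Python's `sqlite3` module rather than the real Kaggle data; the helper name `classify_funds` and the sample values are hypothetical.

```python
import sqlite3


def classify_funds(counts):
    """Apply the task 8 CASE expression to toy invested_companies values."""
    conn = sqlite3.connect(":memory:")
    # Toy stand-in for the fund table: only the column the CASE reads.
    conn.execute("CREATE TABLE fund (invested_companies INTEGER)")
    conn.executemany("INSERT INTO fund VALUES (?)", [(n,) for n in counts])
    rows = conn.execute(
        """
        SELECT invested_companies,
               CASE
                   WHEN invested_companies >= 100 THEN 'high_activity'
                   WHEN invested_companies >= 20 AND invested_companies < 100
                   THEN 'middle_activity'
                   WHEN invested_companies < 20 THEN 'low_activity'
               END AS category
        FROM fund
        ORDER BY invested_companies
        """
    ).fetchall()
    conn.close()
    return rows


print(classify_funds([19, 20, 99, 100]))
```

On these values, 19 lands in "low_activity", 20 and 99 in "middle_activity", and 100 in "high_activity".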

9. For each category from the previous task, calculate the average number of investment rounds in which the funds took part, rounded to the nearest integer. Display the categories and the average number of investment rounds. Sort the table by the average in ascending order.

In [None]:
SELECT
       CASE
           WHEN invested_companies >= 100 THEN 'high_activity'
           WHEN invested_companies >= 20 THEN 'middle_activity'
           ELSE 'low_activity'
       END AS activity,
       ROUND(AVG(investment_rounds)) AS avg_rounds
FROM fund
GROUP BY activity
ORDER BY avg_rounds

10. Find out which countries are home to the funds that invest in startups most often.
For each country, calculate the minimum, maximum, and average number of companies invested in by funds from that country that were established between 2010 and 2012 inclusive. Exclude countries where the minimum number of companies receiving investments is zero.
Export the ten most active investor countries: sort the table by the average number of companies from highest to lowest, then by country code in lexicographic order.

In [None]:
SELECT country_code,
       MIN(invested_companies),
       MAX(invested_companies),
       AVG(invested_companies)
FROM fund
WHERE EXTRACT(YEAR FROM CAST(founded_at AS timestamp)) BETWEEN 2010 AND 2012
GROUP BY country_code
HAVING MIN(invested_companies) != 0
ORDER BY AVG(invested_companies) DESC, country_code
LIMIT 10

11. Display the first and last names of all startup employees. Add a field with the name of the educational institution the employee graduated from, if this information is known.

In [None]:
SELECT first_name,
       last_name,
       instituition
FROM people AS p LEFT OUTER JOIN education AS e ON p.id = e.person_id

12. For each company, find the number of unique universities its employees graduated from. Display the company name and the number of unique university names. Show the top five companies by number of universities.

In [None]:
SELECT c.name,
       COUNT(DISTINCT e.instituition)
FROM company AS c LEFT OUTER JOIN people AS p ON c.id = p.company_id
                       INNER JOIN education AS e ON p.id = e.person_id
GROUP BY c.name
ORDER BY COUNT(DISTINCT e.instituition) DESC
LIMIT 5
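
One detail of this join chain is worth seeing on toy data: an INNER JOIN placed after a LEFT OUTER JOIN drops the rows that the left join kept for companies without employees. The sketch below is an illustration under stated assumptions, not part of the original solution: it uses an in-memory SQLite database via Python's `sqlite3`, and the helper `university_counts` and the companies 'Alpha' and 'Beta' are made up.

```python
import sqlite3


def university_counts(left_join_education):
    """Count unique institutions per company on toy data.

    left_join_education=False mirrors the join chain of task 12;
    True keeps companies that have no matching education rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company (id INTEGER, name TEXT);
        CREATE TABLE people (id INTEGER, company_id INTEGER);
        CREATE TABLE education (person_id INTEGER, instituition TEXT);
        INSERT INTO company VALUES (1, 'Alpha'), (2, 'Beta');
        INSERT INTO people VALUES (10, 1);           -- 'Beta' has no employees
        INSERT INTO education VALUES (10, 'MIT');
        """
    )
    join_kind = "LEFT OUTER JOIN" if left_join_education else "INNER JOIN"
    rows = conn.execute(
        f"""
        SELECT c.name,
               COUNT(DISTINCT e.instituition)
        FROM company AS c
        LEFT OUTER JOIN people AS p ON c.id = p.company_id
        {join_kind} education AS e ON p.id = e.person_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


print(university_counts(False))  # the join chain used in task 12
print(university_counts(True))   # both joins are LEFT OUTER JOIN
```

With the INNER JOIN only 'Alpha' is returned; with two left joins 'Beta' also appears with a count of 0. For a top-five ranking by descending count the difference rarely matters, which is probably why the shorter form is used here.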

13. Create a list of unique names of closed companies for which the first funding round was also the last.

In [None]:
SELECT DISTINCT c.name
FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
WHERE c.status = 'closed'
    AND fr.is_first_round = 1
    AND fr.is_last_round = 1

14. Make a list of the unique IDs of the employees who work at the companies selected in the previous task.

In [None]:
SELECT DISTINCT p.id
FROM people AS p INNER JOIN company AS c ON p.company_id = c.id
WHERE c.name IN (
                SELECT DISTINCT c.name
                FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
                WHERE c.status = 'closed'
                    AND fr.is_first_round = 1
                    AND fr.is_last_round = 1)

15. Build a table of unique pairs: the ID of each employee from the previous task and an educational institution that employee graduated from.

In [None]:
SELECT DISTINCT person_id,
                instituition
FROM education
WHERE person_id IN (SELECT DISTINCT p.id
                    FROM people AS p INNER JOIN company AS c ON p.company_id = c.id
                    WHERE c.name IN (
                                SELECT DISTINCT c.name
                                FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
                                WHERE c.status = 'closed'
                                    AND fr.is_first_round = 1
                                    AND fr.is_last_round = 1))

16.  Count the number of educational institutions for each employee from the previous task. When counting, take into account that some employees may have graduated from the same institution twice. (The condition from the second sentence is not checked!)

In [None]:
SELECT person_id,
       COUNT(instituition)
FROM education
WHERE person_id IN (SELECT DISTINCT p.id
                    FROM people AS p INNER JOIN company AS c ON p.company_id = c.id
                    WHERE c.name in (
                                SELECT DISTINCT c.name
                                FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
                                WHERE c.status = 'closed'
                                    AND fr.is_first_round = 1
                                    AND fr.is_last_round = 1))
GROUP BY person_id
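
The remark about graduating from the same institution twice comes down to the difference between COUNT(column) and COUNT(DISTINCT column). A minimal sketch on toy rows, assuming an in-memory SQLite database via Python's `sqlite3`; the helper name `count_institutions` and the sample data are hypothetical:

```python
import sqlite3


def count_institutions():
    """Return (person_id, all_records, unique_institutions) on toy data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE education (person_id INTEGER, instituition TEXT)")
    conn.executemany(
        "INSERT INTO education VALUES (?, ?)",
        # Person 1 has two records for the same institution.
        [(1, "MIT"), (1, "MIT"), (1, "Stanford"), (2, "Harvard")],
    )
    rows = conn.execute(
        """
        SELECT person_id,
               COUNT(instituition) AS all_records,
               COUNT(DISTINCT instituition) AS unique_institutions
        FROM education
        GROUP BY person_id
        ORDER BY person_id
        """
    ).fetchall()
    conn.close()
    return rows


print(count_institutions())
```

For person 1 the plain COUNT gives 3 while COUNT(DISTINCT ...) gives 2, so the plain COUNT used in this task counts a repeated institution once per record.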

17. Extend the previous query to display the average number of educational institutions (all, not only unique) that employees of the different companies graduated from. Display a single record; no grouping is needed.

In [None]:
WITH a AS (
    SELECT person_id,
           COUNT(instituition) AS inst_count
    FROM education
    WHERE person_id IN (SELECT DISTINCT p.id
                        FROM people AS p INNER JOIN company AS c ON p.company_id = c.id
                        WHERE c.name IN (
                                    SELECT DISTINCT c.name
                                    FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
                                    WHERE c.status = 'closed'
                                        AND fr.is_first_round = 1
                                        AND fr.is_last_round = 1))
    GROUP BY person_id)

SELECT SUM(a.inst_count) / COUNT(a.person_id)
FROM a
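
Whether SUM(...) / COUNT(...) returns a fractional average depends on the database engine and the operand types, whereas AVG sidesteps the question. The sketch below shows the difference in SQLite, where dividing two integers truncates; it assumes an in-memory database via Python's `sqlite3`, and the helper `average_institutions` and the toy rows are hypothetical. Other engines may type the SUM result differently, so treat this as a portability note rather than a statement about the query above.

```python
import sqlite3


def average_institutions():
    """Compare SUM/COUNT with AVG on toy per-person institution counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE education (person_id INTEGER, instituition TEXT)")
    conn.executemany(
        "INSERT INTO education VALUES (?, ?)",
        # Person 1 has three records, person 2 has two.
        [(1, "MIT"), (1, "MIT"), (1, "Stanford"), (2, "Harvard"), (2, "Yale")],
    )
    row = conn.execute(
        """
        WITH a AS (SELECT person_id,
                          COUNT(instituition) AS inst_count
                   FROM education
                   GROUP BY person_id)
        SELECT SUM(a.inst_count) / COUNT(a.person_id) AS sum_over_count,
               AVG(a.inst_count) AS avg_count
        FROM a
        """
    ).fetchone()
    conn.close()
    return row


print(average_institutions())
```

In SQLite this returns 2 for the SUM/COUNT form (integer division of 5 by 2) and 2.5 for AVG.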

18. Write a similar query: display the average number of educational institutions (all, not only unique) that employees of Socialnet graduated from.

In [None]:
WITH a AS (SELECT person_id,
                  COUNT(instituition) AS inst_count
           FROM education
           GROUP BY person_id),

b AS (SELECT p.id AS pid
      FROM people AS p INNER JOIN company AS c ON p.company_id = c.id
      WHERE c.name = 'Socialnet')

SELECT SUM(a.inst_count) / COUNT(a.person_id)
FROM a INNER JOIN b ON a.person_id = b.pid

19. Create a table with the following fields:  
• name_of_fund: fund name;  
• name_of_company: company name;  
• amount: amount of investment the company raised in the round.  
Include only companies with more than six significant milestones in their history and funding rounds that took place from 2012 to 2013 inclusive.

In [None]:
SELECT f.name AS name_of_fund,
       c.name AS name_of_company,
       fr.raised_amount AS amount
FROM fund AS f INNER JOIN investment AS i ON f.id = i.fund_id
               INNER JOIN funding_round AS fr ON i.funding_round_id = fr.id
               INNER JOIN company AS c ON fr.company_id = c.id
WHERE c.milestones > 6
    AND EXTRACT(YEAR FROM CAST(fr.funded_at AS timestamp)) BETWEEN 2012 AND 2013

20. Export a table with the following fields:  
• name of the acquiring company;  
• deal amount;  
• name of the acquired company;  
• amount of investment made in the acquired company;  
• ratio showing how many times the purchase amount exceeded the investment amount in the company, rounded to the nearest whole number.  
  
Exclude deals where the purchase amount is zero. If the investment amount in the company is zero, exclude that company from the table.  
Sort the table by deal amount from highest to lowest, and then by the name of the acquired company in lexicographic order. Limit the table to the first ten entries.

In [None]:
SELECT c_ing.name AS acquiring_company,
       a.price_amount,
       c_ed.name AS acquired_company,
       c_ed.funding_total,
       ROUND(a.price_amount / c_ed.funding_total)
FROM acquisition AS a INNER JOIN company AS c_ing ON a.acquiring_company_id = c_ing.id
                      INNER JOIN company AS c_ed ON a.acquired_company_id = c_ed.id
WHERE a.price_amount != 0
    AND c_ed.funding_total != 0
ORDER BY a.price_amount DESC, acquired_company
LIMIT 10

21. Show a table with the names of companies in the social category that received funding from 2010 to 2013 inclusive. Make sure the amount raised in the round is not zero. Also show the number of the month in which the funding round took place.

In [None]:
SELECT c.name,
       EXTRACT(MONTH FROM CAST(fr.funded_at AS timestamp)) AS funding_month
FROM company AS c INNER JOIN funding_round AS fr ON c.id = fr.company_id
WHERE c.category_code = 'social'
    AND EXTRACT(YEAR FROM CAST(fr.funded_at AS timestamp)) BETWEEN 2010 AND 2013
    AND fr.raised_amount != 0

22. Select data for the months from 2010 to 2013 when funding rounds occurred. Group the data by month number and generate a table with the following fields:  
• month number in which the rounds took place;  
• number of unique fund names from the USA that invested in that month;  
• number of companies acquired in that month;  
• total deal amount for acquisitions in that month.

In [None]:

SELECT x.*,
       y.acquired_companies_count,
       y.sum_amount
FROM
  (SELECT EXTRACT(MONTH FROM CAST(fr.funded_at AS timestamp)) AS funding_month,
       COUNT(DISTINCT f.name) AS funds_count
   FROM funding_round AS fr INNER JOIN investment AS i ON fr.id = i.funding_round_id
                            INNER JOIN fund AS f ON i.fund_id = f.id
   WHERE f.country_code = 'USA'
       AND EXTRACT(YEAR FROM CAST(fr.funded_at AS timestamp)) BETWEEN 2010 AND 2013
   GROUP BY funding_month) AS x
LEFT OUTER JOIN
  (SELECT EXTRACT(MONTH FROM CAST(acquired_at AS timestamp)) AS acquired_month,
          COUNT(acquired_company_id) AS acquired_companies_count,
          SUM(price_amount) AS sum_amount
   FROM acquisition
   WHERE EXTRACT(YEAR FROM CAST(acquired_at AS timestamp)) BETWEEN 2010 AND 2013
   GROUP BY acquired_month) AS y
   ON x.funding_month = y.acquired_month

23. Build a pivot table with the average funding amount per country for startups founded in 2011, 2012, and 2013. Data for each year should be in a separate field. Sort the table by the 2011 average from highest to lowest.

In [None]:
WITH
     -- Average funding per country for companies founded in each year
     inv_2011 AS (SELECT country_code,
                         AVG(funding_total) AS avg_2011
                  FROM company AS c
                  WHERE EXTRACT(YEAR FROM CAST(c.founded_at AS timestamp)) = 2011
                  GROUP BY country_code),
     inv_2012 AS (SELECT country_code,
                         AVG(funding_total) AS avg_2012
                  FROM company AS c
                  WHERE EXTRACT(YEAR FROM CAST(c.founded_at AS timestamp)) = 2012
                  GROUP BY country_code),
     inv_2013 AS (SELECT country_code,
                         AVG(funding_total) AS avg_2013
                  FROM company AS c
                  WHERE EXTRACT(YEAR FROM CAST(c.founded_at AS timestamp)) = 2013
                  GROUP BY country_code)

-- Inner joins keep only countries that have data for all three years
SELECT inv_2011.country_code,
       inv_2011.avg_2011,
       inv_2012.avg_2012,
       inv_2013.avg_2013
FROM inv_2011
INNER JOIN inv_2012
ON inv_2011.country_code = inv_2012.country_code
INNER JOIN inv_2013
ON inv_2011.country_code = inv_2013.country_code
ORDER BY inv_2011.avg_2011 DESC