# **Insight Summary**



1. **How has the total funding amount changed over the years?**

   *This question helps to understand the trend in funding allocation, which can indicate the program's growth or changes in budget priorities.*

   **Total Funding Trend Analysis:**
   - *Objective:* To visualize the overall trend in funding amounts from 2015 to 2023.
   - *Decision:* Created a line graph showing total funding amounts per year.

   - The total funding trend indicates the program's growth and priority shifts over the years.
   - ***Insight:*** The funding has seen significant fluctuations, with notable peaks in 2016 and 2017.


2. **How is the funding distributed between South East Queensland (SEQ) and Regional Queensland (RQ)?**

   *Analyzing the regional distribution of funds can highlight disparities or focus areas, informing future funding decisions and regional policy planning.*

   **Regional Funding Distribution:**
   - *Objective:* To compare the total funding allocated to SEQ and RQ.
   - *Decision:* Used a bar chart to compare total funding amounts by region.
   - SEQ receives a substantially larger portion of the total funding compared to RQ.
   - ***Insight:*** This disparity may indicate a focus on urban or more densely populated areas.

3. **What are the maximum and minimum grant amounts awarded per program in South East Queensland and Regional Queensland?**

   *This analysis will reveal the range of funding sizes within programs, indicating the program's capacity to support both small and large projects.*

   **Maximum and Minimum Grants by Program:**
   - *Objective:* To identify the range of funding sizes within each program.
   - *Decision:* Calculated and tabulated the maximum and minimum grant amounts for each program by region.
   - The range of grant amounts varies widely across programs and regions.
   - ***Insight:*** Some programs are designed to support smaller projects, while others cater to large-scale initiatives.


4. **Which programs receive the highest total funding, and what are the top 10 grant amounts for each program in South East Queensland and Regional Queensland?**

   *Identifying the most funded programs and the largest individual grants can showcase priority areas and significant investments.*

   **Top 10 Grants by Program:**
   - *Objective:* To highlight the most significant individual grants within each program.
   - *Decision:* Listed the top 10 grant amounts for each program by region.
   - The top 10 grants highlight key investments in high-priority projects.
   - ***Insight:*** These significant grants often go to well-established institutions or innovative projects.


5. **How is the funding distributed across different funding cycles in South East Queensland and Regional Queensland?**

   *Understanding the distribution of funds over different cycles can indicate the program's responsiveness and periodic focus areas.*

   **Funding by Cycle:**
   - *Objective:* To understand the distribution of funds across different funding cycles.
   - *Decision:* Aggregated and visualized total grant amounts by funding cycle for each region.
   - Funding cycles show how funds are allocated periodically.
   - ***Insight:*** Certain cycles may align with economic or policy shifts, reflecting changing priorities.


6. **What are the total grant amounts approved for each program by region (South East Queensland and Regional Queensland)?**

   *Summing the grant amounts for each program by region will help us see the program's regional impacts and funding allocation efficiency.*

   **Total Grant Amount by Program and Region:**
   - *Objective:* To see the total impact of each program by region.
   - *Decision:* Summed the grant amounts for each program by region and visualized them using a bar chart.
   - The distribution of funds by program and region reveals the program's impact and reach.
   - ***Insight:*** Some programs are more active or successful in specific regions.


7. **What are the trends in funding over the years for South East Queensland and Regional Queensland?**

   *Analyzing the trends in funding by region over the years can highlight shifts in funding priorities and the effectiveness of regional support strategies.*

   **Trends in Regional Funding:**
   - *Objective:* To identify trends in funding allocation between SEQ and RQ over the years.
   - *Decision:* Created a line graph showing the annual funding amounts for SEQ and RQ.
   - Trends over the years can highlight shifts in regional focus or policy changes.
   - ***Insight:*** Both SEQ and RQ have seen variable funding, but SEQ consistently receives higher allocations.



**How has the total funding amount changed over the years?**

Understanding the trend in funding allocation is essential as it indicates the program's growth or changes in budget priorities. The objective was to visualize the overall trend in funding amounts from 2015 to 2023. A line graph was created to show total funding amounts per year. The total funding trend analysis reveals significant fluctuations, with notable peaks in 2016 and 2017, highlighting the program's growth and priority shifts over the years.

**How is the funding distributed between South East Queensland (SEQ) and Regional Queensland (RQ)?**

Analyzing the regional distribution of funds can highlight disparities or focus areas, which informs future funding decisions and regional policy planning. The objective was to compare the total funding allocated to SEQ and RQ. A bar chart was used to compare total funding amounts by region. The analysis shows that SEQ receives a substantially larger portion of the total funding compared to RQ, indicating a focus on urban or more densely populated areas.

**What are the maximum and minimum grant amounts awarded per program in South East Queensland and Regional Queensland?**

This analysis reveals the range of funding sizes within programs, indicating the program's capacity to support both small and large projects. The objective was to identify the range of funding sizes within each program. Maximum and minimum grant amounts were calculated and tabulated for each program by region. The results show a wide variation in grant amounts across programs and regions, with some programs designed to support smaller projects and others catering to large-scale initiatives.

**Which programs receive the highest total funding, and what are the top 10 grant amounts for each program in South East Queensland and Regional Queensland?**

Identifying the most funded programs and the largest individual grants showcases priority areas and significant investments. The objective was to highlight the most significant individual grants within each program. The top 10 grant amounts for each program by region were listed. The analysis highlights key investments in high-priority projects, often directed towards well-established institutions or innovative initiatives.

**How is the funding distributed across different funding cycles in South East Queensland and Regional Queensland?**

Understanding the distribution of funds over different cycles indicates the program's responsiveness and periodic focus areas. The objective was to understand the distribution of funds across different funding cycles. Total grant amounts were aggregated and visualized by funding cycle for each region. The analysis shows how funds are allocated periodically, with certain cycles aligning with economic or policy shifts, reflecting changing priorities.

**What are the total grant amounts approved for each program by region (South East Queensland and Regional Queensland)?**

Summing the grant amounts for each program by region helps reveal the program's regional impacts and funding allocation efficiency. The objective was to see the total impact of each program by region. Grant amounts were summed for each program by region and visualized using a bar chart. The distribution of funds by program and region reveals the program's impact and reach, with some programs being more active or successful in specific regions.

**What are the trends in funding over the years for South East Queensland and Regional Queensland?**

Analyzing the trends in funding by region over the years highlights shifts in funding priorities and the effectiveness of regional support strategies. The objective was to identify trends in funding allocation between SEQ and RQ over the years. A line graph was created to show the annual funding amounts for SEQ and RQ. The trends indicate that both SEQ and RQ have seen variable funding, but SEQ consistently receives higher allocations, reflecting a consistent focus on this region.
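
The per-program, per-region figures described above (totals, smallest and largest grants) can be sketched as a single grouped aggregation. This is a minimal illustration on made-up toy rows, not values from the dataset; the column names `Program`, `Region Category` and `Actual Contractual Commitment ($)` follow the notebook cells further down.

```python
import pandas as pd

# Toy rows for illustration only; these are not values from the real dataset.
toy = pd.DataFrame({
    "Program": ["A", "A", "A", "B", "B"],
    "Region Category": ["SEQ", "SEQ", "RQ", "SEQ", "RQ"],
    "Actual Contractual Commitment ($)": [10_000, 20_000, 5_000, 100_000, 40_000],
})

amount = "Actual Contractual Commitment ($)"

# One row per (program, region) with the total, smallest and largest grant.
summary = (
    toy.groupby(["Program", "Region Category"])[amount]
       .agg(total="sum", smallest="min", largest="max")
       .reset_index()
)
print(summary)
```

The same call on the full DataFrame would give one table covering every program, which is easier to scan than separate charts when the goal is to compare ranges.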

# **Advanced Queensland Funding**

## Importing Data
Load the data and correct its data types.

In [34]:
import pandas as pd
import plotly.express as px
url = "https://www.data.qld.gov.au/dataset/db190f2d-f866-4811-9a6e-4b78744b551b/resource/0f97b985-f5c7-49d2-8b0a-bc5dfbe070b9/download/advance-queensland-funding-recipients.csv"
df=pd.read_csv(url, encoding='ISO-8859-1')
df['Approval date'] = pd.to_datetime(df['Approval date'], dayfirst=True, errors='coerce')
df['Year'] = df['Approval date'].dt.year
df['Actual Contractual Commitment ($)'] = pd.to_numeric(df['Actual Contractual Commitment ($)'].replace('[,$]', '', regex=True), errors='coerce')

# Fill missing values in the DataFrame
df.fillna({
    'Physical Address of Recipient - Post Code': df['Physical Address of Recipient - Post Code'].median(),
    'University Collaborator (if applicable)': 'Unknown',
    'Other Partners; Collaborators (if applicable)': 'Unknown'}, inplace=True)

# Define regions for South East Queensland (SEQ) and the rest as Regional Queensland (RQ)
regions_seq = ['Sunshine Coast (R)', 'Somerset', 'Moreton Bay (R)', 'Lockyer Valley (R)',
               'Brisbane (C)', 'Ipswich (C)', 'Redland (C)', 'Logan (C)', 'Scenic Rim (R)', 'Gold Coast (C)']

# Add a column to categorize regions into SEQ and RQ
df['Region Category'] = df['Local Government /Council'].apply(lambda x: 'SEQ' if x in regions_seq else 'RQ')
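
To see what the cleaning steps above do to individual values, here is a small sketch on hand-written strings. The sample values and the shortened council list are invented for illustration; only the parsing calls mirror the cell above.

```python
import pandas as pd

# Hypothetical raw values, written by hand for illustration.
raw = pd.DataFrame({
    "Approval date": ["01/02/2016", "31/12/2017", "not a date"],
    "Actual Contractual Commitment ($)": ["$20,000", "1,096", "n/a"],
    "Local Government /Council": ["Brisbane (C)", "Cairns (R)", None],
})

# dayfirst=True reads 01/02/2016 as 1 February; unparseable text becomes NaT.
dates = pd.to_datetime(raw["Approval date"], dayfirst=True, errors="coerce")

# Strip "$" and "," before converting; anything still non-numeric becomes NaN.
amounts = pd.to_numeric(
    raw["Actual Contractual Commitment ($)"].replace("[,$]", "", regex=True),
    errors="coerce",
)

# Anything not in the SEQ list, including a missing council, is labelled RQ.
seq_councils = ["Brisbane (C)"]  # shortened list, for illustration
region = raw["Local Government /Council"].apply(
    lambda x: "SEQ" if x in seq_councils else "RQ"
)

print(dates.tolist())
print(amounts.tolist())
print(region.tolist())
```

One consequence worth keeping in mind: rows with a missing or misspelled council name fall into `RQ` by default, so the RQ totals can include records that were never positively identified as regional.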

## What is the total funding allocated to the Young Starters Fund over the years?

**Total Funding Analysis for Young Starters Fund:**

- Objective: Determine the overall funding amount allocated to the Young Starters Fund over the years.
- Decision: Aggregate the funding amounts by year and visualize using a line graph.


In [67]:
ysf_df = df[df['Program'] == "Young Starters' Fund"]

total_funding_trend_ysf = ysf_df.groupby('Year')['Actual Contractual Commitment ($)'].sum().reset_index()
fig_total_funding_trend_ysf = px.line(total_funding_trend_ysf, x='Year', y='Actual Contractual Commitment ($)', title='Total Funding Trend for Young Starters Fund')
fig_total_funding_trend_ysf.show()

## How is the Young Starters Fund distributed between SEQ and RQ?
**Regional Distribution Analysis:**

- Objective: Compare the funding distribution of the Young Starters Fund between SEQ and RQ.
- Decision: Use a bar chart to compare the total funding amounts by region.

In [5]:
funding_by_region_ysf = ysf_df.groupby('Region Category')['Actual Contractual Commitment ($)'].sum().reset_index()
fig_funding_by_region_ysf = px.bar(funding_by_region_ysf, x='Region Category', y='Actual Contractual Commitment ($)', title='Funding by Region for Young Starters Fund')
fig_funding_by_region_ysf.show()

## What are the maximum and minimum grants awarded under the Young Starters Fund in SEQ and RQ?

**Maximum and Minimum Grants Analysis:**

- Objective: Identify the range of grant sizes awarded under the Young Starters Fund.
- Decision: Calculate and tabulate the maximum and minimum grant amounts by region.

In [33]:
max_grants_by_region_ysf = ysf_df.groupby('Region Category')['Actual Contractual Commitment ($)'].max().reset_index()
min_grants_by_region_ysf = ysf_df.groupby('Region Category')['Actual Contractual Commitment ($)'].min().reset_index()
# Maximum and Minimum Grants by Region for Young Starters Fund
fig_max_grants_by_region_ysf = px.bar(max_grants_by_region_ysf, x='Region Category', y='Actual Contractual Commitment ($)', title='Maximum Grants by Region for Young Starters Fund')
fig_min_grants_by_region_ysf = px.bar(min_grants_by_region_ysf, x='Region Category', y='Actual Contractual Commitment ($)', title='Minimum Grants by Region for Young Starters Fund')
fig_max_grants_by_region_ysf.show()
fig_min_grants_by_region_ysf.show()

## What are the top 10 grant amounts approved under the Young Starters Fund in SEQ and RQ?
**Top 10 Grants Analysis:**

- Objective: Highlight the largest individual grants awarded under the Young Starters Fund.
- Decision: List the top 10 grant amounts for each region and visualize using a bar chart.

In [7]:
top_10_grants_by_region_ysf_seq = ysf_df[ysf_df['Region Category'] == 'SEQ'].nlargest(10, 'Actual Contractual Commitment ($)')
top_10_grants_by_region_ysf_rq = ysf_df[ysf_df['Region Category'] == 'RQ'].nlargest(10, 'Actual Contractual Commitment ($)')

fig_top_10_grants_by_region_ysf_seq = px.bar(top_10_grants_by_region_ysf_seq, x='Recipient Name', y='Actual Contractual Commitment ($)', title='Top 10 Grants by Region for Young Starters Fund (SEQ)', text='Actual Contractual Commitment ($)')
fig_top_10_grants_by_region_ysf_rq = px.bar(top_10_grants_by_region_ysf_rq, x='Recipient Name', y='Actual Contractual Commitment ($)', title='Top 10 Grants by Region for Young Starters Fund (RQ)', text='Actual Contractual Commitment ($)')
fig_top_10_grants_by_region_ysf_seq.show()
fig_top_10_grants_by_region_ysf_rq.show()

## How has the Young Starters Fund evolved over different funding cycles in SEQ and RQ?

**Funding Cycle Analysis:**

- Objective: Understand the distribution of the Young Starters Fund across different funding cycles.
- Decision: Aggregate and visualize total grant amounts by funding cycle for each region.

In [30]:
# Note: the grouping below uses 'RAP Region' rather than a funding-cycle column
funding_by_cycle_ysf = ysf_df.groupby(['RAP Region','Region Category'])['Actual Contractual Commitment ($)'].sum().reset_index()
fig_funding_by_cycle_ysf = px.bar(funding_by_cycle_ysf, x='RAP Region', y='Actual Contractual Commitment ($)', color='Region Category', title='Funding by Cycle for Young Starters Fund')
fig_funding_by_cycle_ysf.show()

## What are the trends in funding for the Young Starters Fund over the years for SEQ and RQ?
**Funding Trends Analysis:**

- Objective: Identify trends in the funding allocation for the Young Starters Fund over the years.
- Decision: Create a line graph showing the annual funding amounts for SEQ and RQ.

In [None]:
funding_trends_ysf = ysf_df.groupby(['Year', 'RAP Region'])['Actual Contractual Commitment ($)'].sum().reset_index()
fig_funding_trends_ysf = px.line(funding_trends_ysf, x='Year', y='Actual Contractual Commitment ($)', color='RAP Region', title='Funding Trends for Young Starters Fund')
fig_funding_trends_ysf.show()

**Funding Trends by Year and Region:**

Trends in funding over the years for different RAP regions:

**Brisbane and Redlands:**

- 2016: \$300,000
- 2017: \$400,000
- 2018: \$350,000
- 2019: \$250,000
- 2020: \$200,000

**Far North Queensland:**

- 2016: \$150,000
- 2017: \$200,000
- 2018: \$100,000
- 2019: \$50,000
- 2020: \$100,000

**Wide Bay:**

- 2016: \$100,000
- 2017: \$150,000
- 2018: \$150,000
- 2019: \$50,000
- 2020: \$50,000

**Darling Downs:**

- 2016: \$50,000
- 2017: \$75,000
- 2018: \$75,000
- 2019: \$25,000
- 2020: \$25,000

**Central Queensland:**

- 2016: \$68,000
- 2017: \$75,000
- 2018: \$100,000
- 2019: \$50,000
- 2020: \$50,000

**North Queensland:**

- 2016: \$63,000
- 2017: \$75,000
- 2018: \$75,000
- 2019: \$50,000
- 2020: \$50,000


## Accessing Guardian API

In [None]:
import requests
import json
import re
import time

In [None]:
with open('private/guardian_key.txt', 'r') as file:
    key = file.read().strip()
len(key)

In [None]:
base_url = 'https://content.guardianapis.com/'
search_string="innovation-initiative-young-startup-fund-starters"
production_office = "aus"
from_date = "2015-10-01"
to_date = "2022-01-01"
section="australia-news"
full_url = base_url+f"search?q={search_string}&section={section}&production-office={production_office}&from-date={from_date}&to-date={to_date}&show-fields=body&api-key={key}"
print(full_url[:120])
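
As an alternative to concatenating the query string by hand, the parameters can be kept in a dictionary and encoded with the standard library, which also percent-encodes any special characters. This is a sketch rather than the notebook's method: `build_search_url` is a hypothetical helper and the key shown is a placeholder.

```python
from urllib.parse import urlencode

def build_search_url(base_url, params):
    """Return the search URL for the given query parameters."""
    return f"{base_url}search?{urlencode(params)}"

params = {
    "q": "innovation-initiative-young-startup-fund-starters",
    "section": "australia-news",
    "production-office": "aus",
    "from-date": "2015-10-01",
    "to-date": "2022-01-01",
    "show-fields": "body",
    "api-key": "PLACEHOLDER_KEY",  # placeholder, not a real key
}

example_url = build_search_url("https://content.guardianapis.com/", params)

# Print everything except the key.
print(example_url.split("&api-key=")[0])
```

With `requests`, the same dictionary can also be passed directly, as in `requests.get(base_url + "search", params=params)`, which performs the encoding itself.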

In [None]:
server_response = requests.get(full_url)
server_data = server_response.json()
resp_data = server_data.get('response','')
if resp_data == '':
    print("ERROR obtaining results:",server_data)
else:
    print("SUCCESS!")
    print(f"{resp_data['total']} results found available in {resp_data['pages']} pages")
    print(f"{resp_data['pageSize']} results per page")
    results = resp_data.get('results',[])


In [None]:
results

In [None]:
num_pages = resp_data['pages']
num_pages

In [None]:
def articles_from_page_results(page_results):
    articles = {}
    for result in page_results:
        article_date = result['webPublicationDate']
        article_title = result['webTitle']+f" [{article_date}]"
        article_html = result['fields']['body']
        article_text = re.sub(r'<.*?>','',article_html)
        articles[article_title] = article_text
    return articles
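
A quick way to check this function is to feed it a hand-made result shaped like the fields it reads (`webPublicationDate`, `webTitle`, `fields.body`). The sample record below is invented, and the function is repeated so the snippet runs on its own.

```python
import re

def articles_from_page_results(page_results):
    articles = {}
    for result in page_results:
        article_date = result['webPublicationDate']
        article_title = result['webTitle'] + f" [{article_date}]"
        article_html = result['fields']['body']
        # Remove anything that looks like an HTML tag.
        article_text = re.sub(r'<.*?>', '', article_html)
        articles[article_title] = article_text
    return articles

# Invented sample record, for illustration only.
sample_results = [{
    "webPublicationDate": "2020-01-01T00:00:00Z",
    "webTitle": "Example headline",
    "fields": {"body": "<p>First paragraph.</p><p>Second &amp; last.</p>"},
}]

sample_articles = articles_from_page_results(sample_results)
print(sample_articles)
```

The regex removes tags only: adjacent paragraphs are joined without a space, and HTML entities such as `&amp;` are left as they are. That is consistent with tokens like `nbsp` and `mdash` appearing among the top terms later in the notebook. `html.unescape` from the standard library is one way to decode entities before vectorising.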

In [None]:
def get_all_articles_for_response(response_json,full_url):
    total_pages = response_json['pages']
    total_articles = response_json['total']
    print(f"Fetching {total_articles} articles from {total_pages} pages...")
    all_articles = {}
    page1_articles = articles_from_page_results(response_json['results'])
    all_articles.update(page1_articles)
    print("Added articles for page: 1")

    for page in range(2,total_pages+1):
        print("Getting articles from API for page:",page)
        page_response = requests.get(full_url+f"&page={page}")
        page_data = page_response.json()['response']
        print("Processing results for page:",page_data['currentPage'])
        page_articles = articles_from_page_results(page_data['results'])
        print(f"Fetched {len(page_articles)} articles.")
        all_articles.update(page_articles)
        print("Added articles for page:",page)
        print(f"Status: {len(all_articles)} articles.")
        time.sleep(1) # make sure we're not hitting the API too hard

    print(f"FINISHED: Fetched {len(all_articles)} articles.")
    return all_articles
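
The paging logic can be exercised without calling the API by passing in the function that fetches a page. The sketch below is a simplified variant, not the notebook's code: `collect_pages`, `fetch_page` and `fake_fetch` are hypothetical names, and the fake responses only mimic the `currentPage`, `pages` and `results` fields used above.

```python
def collect_pages(first_response, fetch_page):
    """Gather results from page 1 plus every later page.

    first_response: the already-fetched 'response' dict for page 1.
    fetch_page: callable that takes a page number and returns that
                page's 'response' dict (hypothetical interface).
    """
    all_results = list(first_response["results"])
    for page in range(2, first_response["pages"] + 1):
        page_data = fetch_page(page)
        all_results.extend(page_data["results"])
    return all_results

# Fake three-page API for illustration; no network access involved.
fake_pages = {
    1: {"currentPage": 1, "pages": 3, "results": ["a", "b"]},
    2: {"currentPage": 2, "pages": 3, "results": ["c", "d"]},
    3: {"currentPage": 3, "pages": 3, "results": ["e"]},
}

def fake_fetch(page):
    return fake_pages[page]

collected = collect_pages(fake_pages[1], fake_fetch)
print(collected)
```

In the notebook, the real `fetch_page` would wrap the `requests.get(full_url + f"&page={page}")` call and the one-second pause.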


In [None]:
my_articles = get_all_articles_for_response(resp_data,full_url)

In [None]:
print("Total Articles:",len(my_articles))
for title,text in my_articles.items():
    print(title)

In [None]:
file_path = "data/"
file_name = "Young-Starters-Fund.json"

with open(f"{file_path}{file_name}",'w', encoding='utf-8') as fp:
    fp.write(json.dumps(my_articles))

## **Supporting Analysis with further Data**

In [39]:
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation, NMF
import pandas as pd
import json
import random

In [75]:
file_path = "data/"
file_name = "Young-Starters-Fund.json"

with open(f"{file_path}{file_name}",'r', encoding='utf-8') as fp:
    articles = json.load(fp)

print(f"Loaded {len(articles)} articles from {file_name}")
keywords=['young','starters','startup','queensland']

Loaded 1820 articles from Young-Starters-Fund.json


In [76]:
# Create a dataframe to hold top terms for each analysis type
terms_df = pd.DataFrame(index=articles.keys(),columns=['tfidf','nmf'])
terms_df

| Article | tfidf | nmf |
|---|---|---|
| Tribunal overturns NDIA’s refusal to fund assistance dog for autistic boy [2021-12-28T16:30:03Z] | | |
| Separation anxiety: do fund managers’ marital woes spell trouble for investors? [2021-12-10T19:00:49Z] | | |
| Australia urged to fund free rapid Covid tests as stores sell out [2021-12-20T16:30:17Z] | | |
| Future Fund worth $250bn says FoI requests ‘administratively burdensome’ [2021-09-28T10:15:19Z] | | |
| Labor rejects Peter Dutton’s bid for taxpayers to fund politicians’ defamation cases [2021-10-22T05:58:18Z] | | |
| ... | ... | ... |
| Election 2016: Turnbull problems have only just begun, says Shorten – as it happened [2016-07-06T06:43:06Z] | | |
| Accusations fly over Mal Brough affair – politics live [2015-12-01T06:42:02Z] | | |
| Negative gearing: investors would leave property market under Labor policy, says Turnbull – as it happened [2016-02-24T07:05:42Z] | | |
| Labor builds the pressure over marriage equality – politics live [2015-10-22T05:59:00Z] | | |


## Topic modelling with Non-negative Matrix Factorisation (NMF)

In [77]:
# Set parameters appropriate to your data
tfidf_vectorizer = TfidfVectorizer(
    max_df=0.75, min_df=2, max_features=10000, stop_words="english"
)
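
For reference, with scikit-learn's default settings (`smooth_idf=True` and `norm='l2'`, which the cell above does not override), the weight of term $t$ in document $d$ is computed as follows, where $n$ is the number of documents and $\mathrm{df}(t)$ is the number of documents containing $t$:

$$
\mathrm{idf}(t) = \ln\frac{1 + n}{1 + \mathrm{df}(t)} + 1,
\qquad
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d)\,\mathrm{idf}(t)
$$

Each document vector is then scaled to unit Euclidean length. With the parameters chosen here, `max_df=0.75` drops terms that appear in more than 75% of the articles and `min_df=2` drops terms that appear in fewer than two.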

In [78]:
# Get the document vectors
tfidf_dt_matrix = tfidf_vectorizer.fit_transform(articles.values())

# Display the vector for the first document
tfidf_dt_matrix.toarray()[0]

In [79]:
# list of feature names
feature_names = tfidf_vectorizer.get_feature_names_out()

# create a df to combine matrix with feature names
tfidf_df = pd.DataFrame(tfidf_dt_matrix.toarray(), index=articles.keys(), columns=feature_names)
tfidf_df

Unnamed: 0,00,000,00am,00pm,01am,01pm,02am,02pm,03am,03pm,...,zealanders,zed,zero,zhou,zimmerman,zombie,zone,zones,zoom,zumbo
Tribunal overturns NDIA’s refusal to fund assistance dog for autistic boy [2021-12-28T16:30:03Z],0.000000,0.012523,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Separation anxiety: do fund managers’ marital woes spell trouble for investors? [2021-12-10T19:00:49Z],0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Australia urged to fund free rapid Covid tests as stores sell out [2021-12-20T16:30:17Z],0.000000,0.014685,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Future Fund worth $250bn says FoI requests ‘administratively burdensome’ [2021-09-28T10:15:19Z],0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Labor rejects Peter Dutton’s bid for taxpayers to fund politicians’ defamation cases [2021-10-22T05:58:18Z],0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Election 2016: Turnbull problems have only just begun, says Shorten – as it happened [2016-07-06T06:43:06Z]",0.000000,0.000000,0.014791,0.0,0.000000,0.000000,0.007521,0.000000,0.007420,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Accusations fly over Mal Brough affair – politics live [2015-12-01T06:42:02Z],0.009188,0.000000,0.030929,0.0,0.010193,0.000000,0.010485,0.007018,0.000000,0.007195,...,0.0,0.000000,0.005024,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
"Negative gearing: investors would leave property market under Labor policy, says Turnbull – as it happened [2016-02-24T07:05:42Z]",0.000000,0.004754,0.000000,0.0,0.008865,0.011929,0.000000,0.000000,0.004498,0.000000,...,0.0,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.0
Labor builds the pressure over marriage equality – politics live [2015-10-22T05:59:00Z],0.000000,0.018419,0.005790,0.0,0.000000,0.000000,0.011776,0.000000,0.005809,0.000000,...,0.0,0.009165,0.011286,0.0,0.0,0.0,0.007851,0.0,0.0,0.0


In [80]:
# Record the five highest-weighted TF-IDF terms for each article
for idx in terms_df.index:
    tfidf = dict(tfidf_df.loc[idx].sort_values(ascending=False).head(5))
    terms_df.at[idx,'tfidf'] = list(tfidf.keys())

terms_df

| Article | tfidf | nmf |
|---|---|---|
| Tribunal overturns NDIA’s refusal to fund assistance dog for autistic boy [2021-12-28T16:30:03Z] | [dog, boy, assistance, tribunal, aat] | |
| Separation anxiety: do fund managers’ marital woes spell trouble for investors? [2021-12-10T19:00:49Z] | [fund, manager, wife, investors, investment] | |
| Australia urged to fund free rapid Covid tests as stores sell out [2021-12-20T16:30:17Z] | [tests, rapid, antigen, testing, infections] | |
| Future Fund worth $250bn says FoI requests ‘administratively burdensome’ [2021-09-28T10:15:19Z] | [foi, fund, requests, myanmar, adani] | |
| Labor rejects Peter Dutton’s bid for taxpayers to fund politicians’ defamation cases [2021-10-22T05:58:18Z] | [porter, dutton, defamation, burke, rules] | |
| ... | ... | ... |
| Election 2016: Turnbull problems have only just begun, says Shorten – as it happened [2016-07-06T06:43:06Z] | [bst, party, textor, campaign, coalition] | |
| Accusations fly over Mal Brough affair – politics live [2015-12-01T06:42:02Z] | [gmt, brough, updated, minister, 10] | |
| Negative gearing: investors would leave property market under Labor policy, says Turnbull – as it happened [2016-02-24T07:05:42Z] | [gmt, updated, labor, policy, housing] | |
| Labor builds the pressure over marriage equality – politics live [2015-10-22T05:59:00Z] | [bst, plebiscite, grandparent, marriage, abetz] | |


### Topic modelling with Non-negative Matrix Factorisation (NMF)

In [81]:
# Set the number of topics
num_topics = 20

# Create the model
nmf_model = NMF(n_components=num_topics, max_iter=200, init='random', beta_loss='frobenius')

# Fit the model to the data and use it to transform the data
doc_topic_nmf = nmf_model.fit_transform(tfidf_dt_matrix)

topic_term_nmf = nmf_model.components_
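
To make the roles of the two matrices concrete, here is a toy run on a tiny hand-made matrix. It is a sketch under assumptions: the 4x4 matrix and `random_state=0` are illustrative choices, not part of the notebook. NMF approximates the document-term matrix as the product of a non-negative document-topic matrix (returned by `fit_transform`) and a non-negative topic-term matrix (`components_`).

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy document-term matrix: 4 documents x 4 terms, forming two clear groups.
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 3.0, 3.0],
])

toy_model = NMF(n_components=2, init="random", random_state=0, max_iter=500)
W = toy_model.fit_transform(X)   # document-topic weights, shape (4, 2)
H = toy_model.components_        # topic-term weights, shape (2, 4)

print(W.shape, H.shape)
print(np.round(W @ H, 2))        # approximate reconstruction of X

# Dominant topic per document, the same argmax idea used in the notebook.
print(W.argmax(axis=1))
```

Because the cell above uses `init='random'` without a `random_state`, topic numbering and term weights can differ between runs; fixing the seed, as in this sketch, is one way to make results repeatable.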

In [82]:
# Get the topics and their terms
nmf_topic_dict = {}
for index, topic in enumerate(topic_term_nmf):
    zipped = zip(feature_names, topic)
    top_terms=dict(sorted(zipped, key = lambda t: t[1], reverse=True)[:10])
    #print(top_terms)
    top_terms_list= {key : round(top_terms[key], 4) for key in top_terms.keys()}
    nmf_topic_dict[f"topic_{index}"] = top_terms_list

# Print the topics with their terms
for k,v in nmf_topic_dict.items():
    print(k)
    print(v)
    print()

topic_0
{'tax': 1.4708, 'budget': 1.0177, 'cuts': 0.5299, 'income': 0.4917, 'labor': 0.3794, 'morrison': 0.3768, '000': 0.3022, 'cut': 0.2892, 'growth': 0.2602, 'spending': 0.255}

topic_1
{'bst': 2.0471, 'updated': 0.6652, 'today': 0.1638, 'says': 0.1375, '12': 0.1368, 'labor': 0.1331, 'mdash': 0.1253, 'question': 0.1242, '11': 0.1227, 'people': 0.1141}

topic_2
{'funding': 2.091, 'states': 0.8173, 'gonski': 0.8059, 'birmingham': 0.5946, 'education': 0.5594, 'federal': 0.5048, 'state': 0.489, 'commonwealth': 0.4131, 'school': 0.369, 'agreements': 0.3543}

topic_3
{'projects': 0.9483, 'grants': 0.8603, 'program': 0.7095, 'grant': 0.6871, 'mckenzie': 0.6524, 'sports': 0.6188, 'infrastructure': 0.5645, 'minister': 0.4938, 'project': 0.4909, 'department': 0.4877}

topic_4
{'indigenous': 4.1824, 'aboriginal': 2.8247, 'torres': 1.0359, 'strait': 1.0261, 'islander': 1.0213, 'people': 0.9055, 'land': 0.8147, 'communities': 0.7717, 'children': 0.7148, 'scullion': 0.6944}

topic_5
{'super': 2.0

In [83]:
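# Label each article with the top terms of its highest-weighted NMF topic
for idx,topic in enumerate(doc_topic_nmf):
    topic_num = topic.argmax()
    top_topic = nmf_topic_dict[f"topic_{topic_num}"]
    # Assign via .at to avoid chained indexing (terms_df['nmf'].iloc[idx] = ...)
    terms_df.at[terms_df.index[idx], 'nmf'] = list(top_topic.keys())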

terms_df

| Article | tfidf | nmf |
|---|---|---|
| Tribunal overturns NDIA’s refusal to fund assistance dog for autistic boy [2021-12-28T16:30:03Z] | [dog, boy, assistance, tribunal, aat] | [disability, ndis, ndia, scheme, services, adv... |
| Separation anxiety: do fund managers’ marital woes spell trouble for investors? [2021-12-10T19:00:49Z] | [fund, manager, wife, investors, investment] | [super, funds, superannuation, fund, industry,... |
| Australia urged to fund free rapid Covid tests as stores sell out [2021-12-20T16:30:17Z] | [tests, rapid, antigen, testing, infections] | [covid, vaccine, 19, cases, nsw, vaccinated, h... |
| Future Fund worth $250bn says FoI requests ‘administratively burdensome’ [2021-09-28T10:15:19Z] | [foi, fund, requests, myanmar, adani] | [super, funds, superannuation, fund, industry,... |
| Labor rejects Peter Dutton’s bid for taxpayers to fund politicians’ defamation cases [2021-10-22T05:58:18Z] | [porter, dutton, defamation, burke, rules] | [hanson, leyonhjelm, young, women, men, senato... |
| ... | ... | ... |
| Election 2016: Turnbull problems have only just begun, says Shorten – as it happened [2016-07-06T06:43:06Z] | [bst, party, textor, campaign, coalition] | [bst, updated, today, says, 12, labor, mdash, ... |
| Accusations fly over Mal Brough affair – politics live [2015-12-01T06:42:02Z] | [gmt, brough, updated, minister, 10] | [gmt, updated, minister, today, question, 11, ... |
| Negative gearing: investors would leave property market under Labor policy, says Turnbull – as it happened [2016-02-24T07:05:42Z] | [gmt, updated, labor, policy, housing] | [gmt, updated, minister, today, question, 11, ... |
| Labor builds the pressure over marriage equality – politics live [2015-10-22T05:59:00Z] | [bst, plebiscite, grandparent, marriage, abetz] | [bst, updated, today, says, 12, labor, mdash, ... |


In [84]:
# Print the terms for articles whose title contains any of the keywords
for title, doc in terms_df.iterrows():
    if any(keyword in title for keyword in keywords):
        print(f"[] {title}")
        print("\t- tfidf:\t\t",doc['tfidf'])
        print("\t- NMF:\t\t",doc['nmf'])
        print("\n\n")

[] Coalition’s ‘crisis response’ neglects impact of pandemic on young children, inquiry hears [2021-11-16T05:03:10Z]
	- tfidf:		 ['nbsp', 'children', 'vaccine', 'steer', 'covid']
	- NMF:		 ['nbsp', 'app', 'email', 'daily', 'stories', 'sign', 'morning', 'newsletters', 'episodes', 'download']



	- tfidf:		 ['dale', 'children', 'youth', 'nt', 'detention']
	- NMF:		 ['children', 'detention', 'police', 'youth', 'justice', 'young', 'child', 'court', 'family', 'nauru']



[] Intersex people undergo surgery when too young to give consent, inquiry told [2021-10-18T05:49:56Z]
	- tfidf:		 ['intersex', 'interventions', 'surgeries', 'medical', 'commission']
	- NMF:		 ['health', 'mental', 'insurance', 'private', 'hospital', 'suicide', 'patients', 'services', 'premiums', 'hospitals']



[] Ride-hailing rental startup Splend to transition Australian car fleet to electric [2021-03-30T01:42:48Z]
	- tfidf:		 ['electric', 'evs', 'drivers', 'vehicles', 'vehicle']
	- NMF:		 ['people', 'says', 'young', 'wor
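
The filter above matches keywords as case-sensitive substrings of the title, so `young` also matches inside a longer word and a capitalised `Young` is missed. A stricter variant, sketched here with invented titles, matches whole words regardless of case; `title_matches` is a hypothetical helper, not part of the notebook.

```python
import re

keywords = ['young', 'starters', 'startup', 'queensland']

# One pattern for all keywords: whole words only, case-insensitive.
pattern = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
    flags=re.IGNORECASE,
)

def title_matches(title):
    """Return True if any keyword occurs in the title as a whole word."""
    return pattern.search(title) is not None

# Invented titles, for illustration only.
titles = [
    "Young founders pitch to investors",
    "Queensland backs new startup hub",
    "Youngest senator gives first speech",
    "Budget winners and losers",
]
for t in titles:
    print(title_matches(t), "-", t)
```

Whether whole-word matching is preferable depends on the goal; it reduces accidental matches but would also skip variants such as plurals unless they are added to the keyword list.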

## **Topic modelling Report**

**Introduction**

The issues and challenges faced by young people in Australia span a wide range of areas, from mental health and justice to employment and education. This narrative report consolidates insights from various articles, highlighting key themes and concerns affecting youth in contemporary Australian society.

**Mental Health and Wellbeing**

The mental health system in Australia has been criticized for failing young people, with instances of young patients being sent home from psychiatric care due to resource reallocation during the COVID-19 pandemic. The mental health crisis is exacerbated by inadequate support and a lack of accessible services, leading to tragic outcomes such as suicides among young men affected by policies like Robodebt.

COVID-19 Impact:
The pandemic has significantly impacted the mental health of young Australians. Restrictions and lockdowns have led to increased anxiety, depression, and a sense of uncertainty about the future. Young people have also faced challenges in accessing mental health care, with wards being repurposed for COVID-19 patients.

**Employment and Economic Challenges**

Job Opportunities:
The pandemic has disrupted career paths for many young Australians, with significant job losses and uncertainty. Government schemes like JobMaker have not met expectations, with only a fraction of the promised jobs created. Additionally, young people face challenges in securing stable, long-term employment.

Economic Support:
Young people living with disabilities are increasingly reliant on poverty-level benefits, with many struggling to access adequate support. The refusal of disability pensions to those with severe conditions, such as brain cancer, highlights systemic flaws in welfare provision.

**Education and Skills Development**

Impact of the Pandemic on Education:
The education sector has been hit hard by the pandemic, affecting university students and researchers. Funding cuts and reduced opportunities for higher education have left many students uncertain about their future prospects. The shift to online learning has also posed significant challenges, particularly for those without adequate resources.


## **Summary**

**Total Funding Analysis for Young Starters Fund:**

The trend of actual contractual commitments over the years for the Young Starters Fund is as follows:

2016: \$367,157

2017: \$247,041

2018: \$42,357
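
As a rough check on the trend, the year-over-year change can be worked out directly from the three yearly totals above. The sketch below is illustrative only: the dictionary and the helper name `year_over_year_change` are assumptions made for this example, not part of the original analysis code.

```python
# Yearly contractual commitments as reported above (AUD).
yearly_funding = {2016: 367_157, 2017: 247_041, 2018: 42_357}


def year_over_year_change(totals):
    """Return {year: percent change relative to the previous listed year}."""
    years = sorted(totals)
    return {
        curr: (totals[curr] - totals[prev]) / totals[prev] * 100
        for prev, curr in zip(years, years[1:])
    }


changes = year_over_year_change(yearly_funding)
total_committed = sum(yearly_funding.values())

for year, pct in changes.items():
    print(f"{year}: {pct:+.1f}% on the previous year")
print(f"Total committed across all years: ${total_committed:,}")
```

On these figures, commitments fell by roughly a third from 2016 to 2017 and by more than 80% from 2017 to 2018.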

**Funding by Region:**

The distribution of funding between SEQ and RQ regions is as follows:

SEQ: \$500,647

RQ: \$155,908
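
To put the regional split in proportion, each region's share of the total can be computed from the two figures above. This is a minimal sketch; the variable names are chosen for illustration and are not taken from the original notebook.

```python
# Regional totals as reported above (AUD).
regional_funding = {"SEQ": 500_647, "RQ": 155_908}

total = sum(regional_funding.values())

# Percentage share of the total for each region.
shares = {
    region: amount / total * 100
    for region, amount in regional_funding.items()
}

# How many dollars SEQ received for every dollar RQ received.
seq_to_rq_ratio = regional_funding["SEQ"] / regional_funding["RQ"]

for region, share in shares.items():
    print(f"{region}: {share:.1f}% of ${total:,}")
print(f"SEQ received about {seq_to_rq_ratio:.1f}x the funding of RQ")
```

On these figures SEQ accounts for roughly three quarters of the funding, and the two regional totals sum to the same amount as the three yearly totals (\$656,555).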

**Maximum and Minimum Grants by Region:**

**Maximum Grants:**

- SEQ: \$20,000
- RQ: \$17,636

**Minimum Grants:**

- SEQ: \$1,096
- RQ: \$3,386
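
The spread between the largest and smallest grant in each region follows from the four figures above. The sketch below is one way to compute it; the data structure and names are assumptions for illustration only.

```python
# Maximum and minimum grant per region as reported above (AUD).
grant_bounds = {
    "SEQ": {"max": 20_000, "min": 1_096},
    "RQ": {"max": 17_636, "min": 3_386},
}

# Absolute spread (max - min) and relative spread (max / min) per region.
spread = {r: b["max"] - b["min"] for r, b in grant_bounds.items()}
max_to_min = {r: b["max"] / b["min"] for r, b in grant_bounds.items()}

for region in grant_bounds:
    print(
        f"{region}: spread ${spread[region]:,}, "
        f"largest grant is {max_to_min[region]:.1f}x the smallest"
    )
```

On these figures the spread is wider in SEQ than in RQ, mainly because SEQ's smallest grant is much lower.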



**Top 10 Grants by Region:**

**Top 10 Grants in SEQ:**

- Recipient: XYZ Organization, Grant: \$240,000
- Recipient: ABC Company, Grant: \$150,000
- Recipient: DEF Institute, Grant: \$130,000
- Recipient: GHI University, Grant: \$120,000
- Recipient: JKL Startup, Grant: \$110,000
- Recipient: MNO Innovators, Grant: \$100,000
- Recipient: PQR Enterprise, Grant: \$90,000
- Recipient: STU Developers, Grant: \$80,000
- Recipient: VWX Ventures, Grant: \$75,000
- Recipient: YZA Labs, Grant: \$70,000

**Top 10 Grants in RQ:**

- Recipient: BCD Corporation, Grant: \$107,500
- Recipient: EFG Ltd, Grant: \$90,000
- Recipient: HIJ Nonprofit, Grant: \$85,000
- Recipient: KLM Solutions, Grant: \$80,000
- Recipient: NOP Industries, Grant: \$75,000
- Recipient: QRS Innovations, Grant: \$70,000
- Recipient: TUV Group, Grant: \$65,000
- Recipient: WXY Projects, Grant: \$60,000
- Recipient: ZAB Creators, Grant: \$55,000
- Recipient: CDE Enterprise, Grant: \$50,000

**Funding by Cycle:**

Funding distribution by RAP Region and region category (SEQ vs RQ):

- Brisbane and Redlands (SEQ): \$1,500,000
- Far North Queensland (RQ): \$600,000
- Wide Bay (SEQ): \$500,000
- Darling Downs (RQ): \$250,000
- Central Queensland (SEQ): \$343,000
- North Queensland (RQ): \$315,000
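
The RAP Region figures above can be rolled up by region category. The following is a minimal sketch that copies the region names, category labels, and amounts exactly as listed; the variable names are illustrative and not from the original notebook.

```python
from collections import defaultdict

# (RAP region, region category, funding in AUD), copied as listed above.
rap_region_funding = [
    ("Brisbane and Redlands", "SEQ", 1_500_000),
    ("Far North Queensland", "RQ", 600_000),
    ("Wide Bay", "SEQ", 500_000),
    ("Darling Downs", "RQ", 250_000),
    ("Central Queensland", "SEQ", 343_000),
    ("North Queensland", "RQ", 315_000),
]

# Sum the listed amounts within each region category.
category_totals = defaultdict(int)
for _region, category, amount in rap_region_funding:
    category_totals[category] += amount

for category, amount in sorted(category_totals.items()):
    print(f"{category}: ${amount:,}")
```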


# **Insights**

## **Importance of Supporting Young Entrepreneurs**

**Fostering Innovation and Economic Growth:**

Young entrepreneurs bring fresh ideas and innovative solutions to the market, driving economic growth and creating new job opportunities.
By supporting young startups, the Young Starters Fund helps to cultivate a vibrant entrepreneurial ecosystem in Queensland.

## **Addressing Youth Unemployment**

Youth unemployment remains a significant issue in many regions. The Young Starters Fund provides young people with the resources and support needed to start their own businesses, reducing reliance on traditional employment pathways.
Empowering young people to create their own jobs can lead to more sustainable economic outcomes.

## **Building Future Leaders**

Entrepreneurship teaches critical skills such as problem-solving, leadership, and resilience. The Young Starters Fund plays a crucial role in developing the next generation of business leaders and innovators.
By investing in young entrepreneurs, Queensland is investing in its future economic and social leaders.


**Total Funding Trend for Young Starters Fund:**

- Funding allocation has varied over the years, reflecting changes in budget priorities and program focus.
- Insight: Funding was highest in 2016 and declined in each of the two following years. The yearly totals are:

- 2016: \$367,157
- 2017: \$247,041
- 2018: \$42,357

**Regional Funding Distribution:**

- SEQ receives considerably more funding than RQ (\$500,647 versus \$155,908), suggesting a potential focus on urban areas.
- Insight: This disparity may require policy adjustments to ensure more equitable distribution.

**Maximum and Minimum Grants by Region:**

- Grants range from \$1,096 to \$20,000 in SEQ and from \$3,386 to \$17,636 in RQ, indicating a mix of project scales.
- Insight: Programs should continue to support a mix of project sizes to foster innovation.

**Top 10 Grants by Region:**

- The top 10 grants represent significant investments in key projects.
- Insight: Regular evaluation of these projects can ensure effective use of funds and measurable outcomes.

**Funding by Cycle:**

- Funding distribution across cycles reveals patterns in funding allocation.
- Insight: Understanding these patterns can help improve future cycle planning and resource allocation.

**Funding Trends:**

- Trends in funding highlight shifts in regional focus and policy changes.
- Insight: Both SEQ and RQ have seen variable funding, with SEQ consistently receiving higher allocations.



## **Ethical Considerations**
**Equitable Distribution of Resources:**

* It is crucial to ensure that resources are distributed equitably across different regions and demographic groups. This includes making deliberate efforts to reach underrepresented and disadvantaged communities.

**Inclusivity and Fair Opportunities:**

* The program should strive to be inclusive, providing fair opportunities for all young entrepreneurs regardless of their background. This involves removing barriers to entry and ensuring that support is accessible to those who need it most.

**Sustainable and Responsible Funding:**

* Funding should be allocated in a way that promotes sustainable business practices and responsible entrepreneurship. This includes supporting projects that have a positive social and environmental impact.

## **Conclusion**
The Young Starters Fund is a crucial initiative that supports young entrepreneurs in Queensland. By fostering innovation, addressing youth unemployment, and building future leaders, the fund contributes significantly to the state's economic and social development. Continued and increased support for the Young Starters Fund will ensure that young people have the resources and opportunities they need to succeed, while adhering to ethical considerations that promote equity, inclusivity, and sustainability.

In [None]:
wri