<div class="alert alert-block alert-info"><b>IAB303</b> - Data Analytics for Business Insight</div>

# Studio :: The Data Analytics Cycle

---

## Why Data Analytics?

Imagine:
* You've just started a new job
* The Marketing manager drops into your office with a problem: the business is looking to launch an agricultural product in either Australia or New Zealand, but management is unsure which country to start with. Can you help?

Consider:
* How significant is this problem for the business?
* What information do you need to address this concern?
* How do you go about obtaining this information?

### Human Intelligence as a starting point

* How does the Marketing manager describe the problem?
* What does their gut feeling, their intuition tell them?
* At a guess, why do they think this is happening?

* The importance of intuition
* The risks of intuition

#### What value does Data Analytics offer in this scenario?

## Example

Say we want to find out what percentage of each country's GDP comes from agriculture. The data is located in the data folder, under week-2, in the file `agric_gdp.csv`.

* Source: [GapMinder](https://www.gapminder.org/data/)

In [None]:
# Import pandas for dataframes and matplotlib for plotting
import matplotlib.pyplot as plt
import pandas

# Set variables for file and index column
file = "../data/week-2/agric_gdp.csv" #see above
colname = "country" #open the csv and have a look

# Read in the percent of gdp data
ag_gdp = pandas.read_csv(file, index_col= colname)
print(ag_gdp.shape)

In [None]:
# Take a look at the data
ag_gdp

### Making sense

* How do we make meaning of this data?
* What intuitions might we have on the data?
* How do we test these intuitions?
* What do we need to do to make this raw data useful?

In [None]:
# Just select the countries we are interested in by referencing the index
ag_gdp_au = ag_gdp.loc["Australia"]
ag_gdp_nz = ag_gdp.loc["New Zealand"]
print(ag_gdp_au)
print(ag_gdp_nz)

### Not so easy :(

* How long did it take you to spot the agricultural percent of GDP for each country?
* How easy would it be to compare more countries?
* What about dozens?
* What is the computer good at?
* What is the computer bad at?
* What are humans good at?
* What are humans bad at?
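
One way to let the computer do the comparing is to select several countries at once and line them up side by side. The sketch below assumes the same layout as `agric_gdp.csv` (countries in the index, years as columns), but the country names, year labels and values are invented for illustration and are not GapMinder figures.

```python
import pandas

# Toy data in the same shape as agric_gdp.csv: countries as the index,
# years as columns. All numbers are invented for illustration only.
toy_gdp = pandas.DataFrame(
    {"2001": [3.0, 8.0, 20.0], "2002": [2.5, 7.0, 19.0]},
    index=["Country A", "Country B", "Country C"],
)
toy_gdp.index.name = "country"

# Select several countries at once by passing a list of labels to .loc
selected = toy_gdp.loc[["Country A", "Country B"]]

# Transpose so each country becomes a column and each year a row,
# which makes a side-by-side comparison easier to read
side_by_side = selected.T
print(side_by_side)

# Ask the computer which selected country has the higher mean share
mean_share = selected.mean(axis=1)
print(mean_share.idxmax())
```

Adding more countries only means adding more labels to the list, which is the kind of repetitive work the computer handles well.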

---

## The Data Analytics Cycle


For this unit, we are concerned with more than just data analytics. We are interested in what is *appropriate, efficacious, ethical ...* that is, the ***right*** kind of analytics to help provide the ***right*** kind of insights for business.

In doing Data Analytics, we will follow a cycle - **QDAVI** - to address a business concern:

1. **Q**uestion
2. **D**ata
3. **A**nalysis
4. **V**isualisation
5. **I**nsight

<img src="graphics/QDAVI_cycle_sm.png" width="50%" />

### 1. Question

**Concern:** Which country, Australia or New Zealand, is more suitable for launching a new agricultural product?

> What insights can be extracted from the agricultural percent of GDP of a country?

### 2. Data

Select and load data

In [None]:
# Import pandas for dataframes and matplotlib for plotting
import matplotlib.pyplot as plt
import pandas

# Set variables for file and index column
file = ??? #see above
colname = ??? #open the csv and have a look

# Read in the percent of gdp data
ag_gdp = pandas.read_csv(file, index_col= colname)
print(ag_gdp.shape)

In [None]:
# Take a look at the data
ag_gdp.head(10)

Clean and preprocess the data

In [None]:
# Take the last 5 years of the GDP data
most_recent_five_years = [???, ???, ???, ???, ???]
ag_gdp_cln = ag_gdp.filter(most_recent_five_years, axis=1)
print(ag_gdp_cln.shape)

# Just select the countries we are interested in by referencing the index
ag_gdp_au = ag_gdp_cln.loc[???]
ag_gdp_nz = ag_gdp_cln.loc[???]

In [None]:
# Take a look at the data for AU
ag_gdp_au

In [None]:
# Take a look at the data for NZ
ag_gdp_nz
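
The cleaning step above can be sketched on a toy table. The country names, year labels and values below are hypothetical stand-ins; open the real CSV to check the actual year labels (and whether pandas reads them as strings) before filling in the blanks.

```python
import pandas

# Invented values; the years are column labels stored as strings
toy_gdp = pandas.DataFrame(
    {"2000": [4.0, 9.0], "2001": [3.5, 8.5], "2002": [3.0, 8.0]},
    index=["Country A", "Country B"],
)

# Keep only the columns (axis=1) whose labels appear in the list
recent_years = ["2001", "2002"]
toy_cln = toy_gdp.filter(recent_years, axis=1)
print(toy_cln.shape)

# Selecting one row by its index label returns a Series indexed by year
toy_a = toy_cln.loc["Country A"]
print(toy_a)
```

Note that `filter` quietly ignores labels that do not exist in the table, so a shape check like the one above is a cheap way to confirm that every requested year was found.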

### 3. Analysis

- What is the problem with the NZ data?
- What do we need to do?
- For now, we don't do any more analysis - we are more interested in the process
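
A common problem with country-level series is missing values, which pandas shows as `NaN`. Whether that is the issue with the NZ data is for you to confirm. The sketch below uses an invented Series and only shows how one might check for gaps and drop them.

```python
import pandas

# Invented series with one missing year (None is stored as NaN)
toy_series = pandas.Series(
    [6.0, 5.5, None, 5.0],
    index=["2001", "2002", "2003", "2004"],
)

# Count how many values are missing
n_missing = toy_series.isna().sum()
print(n_missing)

# One simple option: drop the missing entries before plotting
toy_clean = toy_series.dropna()
print(toy_clean)
```

Dropping values is only one option. Whether it is appropriate depends on the question being asked, and the decision should be recorded as part of the analysis.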

### 4. Visualisation

In [None]:
# Plot the 2 countries
plt.plot(???)
plt.plot(???)

In [None]:
# Add labels and set colours
plt.plot(ag_gdp_au,'g-',label="Australia")
plt.plot(ag_gdp_nz,'m-',label="New Zealand")

# Create legend.
plt.legend(loc='upper right')
plt.xlabel("Years")
plt.ylabel("% of GDP")

### 5. Insight

1. What is the concern?
2. What data did we use?
3. How did we analyse it? What decisions did we make, and why?
4. What do the visualisations tell us?
5. What is the recommendation for the concern? What other information would be helpful? What *doesn't* the data tell us? Can we make inferences?

---

## The Big Idea: Addressing business concerns through storytelling with information

1. **CONCERN:** The business concern or problem, understood in the context of the business and in relation to the stakeholders.

2. **DATA ANALYTICS:** Potential sources of information that exist inside or outside of the business, or which may be synthesised, in order to address a business concern. Techniques, processes, and tools that can be used to analyse available data for the purpose of addressing a business concern.

3. **MEANING:** Relationships, perspectives, narratives, and understandings that are supported by the data analytics in a way that is meaningful for stakeholders and holds efficacy in addressing a business concern.

### CONCERN

* what kind of problem - is it a business problem?
* who are the stakeholders?
* what is the context?
* business model disruption
* talent management
* global market trends
* foresight
* political risk


#### LEARN MORE

> "If you aren't harnessing the power of data, you're almost certain to end up falling behind."
>
> [The Top Issues CEOs Face These Days (2014)](https://www.wsj.com/articles/executive-leadership-what-are-the-top-issues-ceos-face-these-days-1395267060)

> "Don't ever try and present a technology solution to a business problem"
>
> [Technology Solutions Do Not Always Solve Business Problems](https://youtu.be/J7XAFa4wXgY)


#### LEARN MORE

> "Competitive innovation waits for no one"
>
> [Worst Company Disasters! | Top 6 Blunders](https://youtu.be/T0Z73Zbtlyg) (16 mins)

> "You promised me Mars colonies. Instead, I got Facebook."
>
> [Jason Pontin: Can technology solve our big problems?](https://youtu.be/ZB50BfYlsDc)

### DATA

* external vs internal
* external data for a bigger picture
* industry, consumer, product trends
* needs to be available for decisions
* quality an issue
* governance
* realtime


#### LEARN MORE

> "Only one-third of enterprises currently use information to identify new business opportunities and predict future trends and behavior"
>
> [14 Survey-Based Recommendations on How to Improve Data-Driven Decision-Making](https://bi-survey.com/data-driven-decision-making-business)

> "External data can give you real-time, minute-by-minute updates on industry, consumer, and product trends."
>
> [Why now is the perfect time to go all in on external data analytics](https://www.import.io/post/why-now-is-the-perfect-time-to-go-all-in-on-external-data-analytics/)

#### LEARN MORE

> "external data is one of the biggest blind spots in executive decision making today"
>
> [Outside Insight: Why External Data Is The Fuel Of Tomorrow's Business Success](https://www.forbes.com/sites/bernardmarr/2017/11/15/outside-insight-why-external-data-is-the-fuel-of-tomorrows-business-success/#443d8fa25e1d)

> "There are many free, external data sources posted around the Internet that can, if used well, completely transform our understanding of our market, audience, and the way we do business."
>
> [Free Data Sources to Upgrade Your Business Decision-Making](https://www.sisense.com/blog/free-data-sources-upgrade-business-decision-making/)

### ANALYTICS

* anchored to business value
* pragmatic approach
* test strategies
* invest in data for analytics insights


#### LEARN MORE

> "big data analytics is not trawl fishing. It’s spear fishing"
>
> [Big data analytics should be driven by business needs, not technology](https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/big-data-analytics-should-be-driven-by-business-needs-not-technology)

> "So, one good rule of thumb is to always have a clear analytical objective."
> 
> [Here Are The Benefits of Data-Driven Decision Making](https://www.entrepreneur.com/article/280923)

### MEANING

* proactivity
* mitigating risk
* customer experience
* design thinking for human problems

#### LEARN MORE

> "Today, businesses can collect data along every point of the customer journey"
>
> [5 Big Benefits of Data and Analytics for Positive Business Outcomes](http://blogs.teradata.com/data-points/5-big-benefits-data-analytics-positive-business-outcomes/)

> "In design, we build our way forward"
>
> [Want to Make Better Decisions? Know the Difference between Engineering and Design Thinking](https://youtu.be/q7LRxKHdao8) (7 mins)

