## 1. Background
### 1.1 Scenario
A group of surfers wants to start a business that connects tourists with a 24-hour surfing experience on the north or south coast, including overnight accommodation and transport to a local surf spot. To deliver the best experience within a fixed cost, they need to balance distance and traffic against weather and wave height. They want to know the best period of the year to run this service, which areas should be used for the overnight accommodation, and whether they can get near-real-time data that lets them decide which beach to travel to before 5:00am (their planned pickup time from the hotel).

### 1.2  Interpretation
The first problem is deciding what "the north or south coasts" means. It is hard to say, because it depends on where you are and what range you consider. In this notebook, I take Brisbane as the centre because it has the largest population in Queensland. The north coast is therefore the Sunshine Coast and the south coast is the Gold Coast.

For this business, <b>the first important thing</b> is to figure out who our customers are and whether the business is worth running. <b>Then</b> I consider which months of the year make up the surfing season. After setting the surfing season, <b>the next step</b> is to get weather and wave forecast data for the surfing spots and find the best spot for our customers. <b>When it comes to the accommodation</b>, the basic idea is that it should be on the Gold Coast or the Sunshine Coast, close to the surfing spots, because customers do not want a long trip before they arrive. <b>I will not consider traffic conditions</b> in this Jupyter notebook, because traffic in these two cities is good except during peak periods. The plan is simply to avoid peak periods and follow Google Maps directions.

## 2. Who are our customers?
Before we start the business, we should figure out how big the market is and what kind of people we want to serve.

Currently, there are more than 35 million surfers worldwide, and the estimated global surf industry spend is more than $10 billion a year. Australia is renowned as one of the world's premier surfing destinations, so it also attracts many surfers from around the world. At the same time, in Australia <b>alone, one in every 20 people surfs.</b> There are approximately 2.5 million recreational surfers in Australia and 420,000 annual surf participants. We can feel this passion in posts on Twitter.

### 2.1 Feeling the surfing passion on Twitter

We get the data from the Twitter API and analyse what people think about surfing in Australia.

In [None]:
# import required libraries
import tweepy           # To access and consume Twitter's API
import pandas as pd     # To handle data
import numpy as np      # For number computing
from IPython.display import display
import matplotlib.pyplot as plt
import seaborn as sns
from textblob import TextBlob
import re

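# Twitter App access keys
# Read the keys from environment variables instead of hard-coding secrets in the notebook
# (the variable names below are a suggested convention, set them before running this cell)
import os

# Consumer:
CONSUMER_KEY    = os.environ['TWITTER_CONSUMER_KEY']
CONSUMER_SECRET = os.environ['TWITTER_CONSUMER_SECRET']

# Access:
ACCESS_TOKEN  = os.environ['TWITTER_ACCESS_TOKEN']
ACCESS_SECRET = os.environ['TWITTER_ACCESS_SECRET']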

# API's setup:
def connectToTwitterAPI():
    """
    Utility function to setup the Twitter's API
    with access keys.
    """
    # Authentication and access using keys
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

    # Return API with authentication
    api = tweepy.API(auth)
    return api



In [None]:
# Create an extractor object
extractor = connectToTwitterAPI()

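# Search Twitter for surfing in Australia
# Specify search criteria and extract tweets into a list
# Note: the standard search API returns at most 100 tweets per request, so count=1000 is capped.
# In Tweepy 4.x this method is named search_tweets.
tweets = extractor.search(q="#surfing Australia", lang = "en", count=1000)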

# Print the total number of extracted tweets
print("Number of tweets extracted: {}.\n".format(len(tweets)))

In [None]:
# Create a pandas dataframe
data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

# Add relevant data from each tweet
data['len']  = np.array([len(tweet.text) for tweet in tweets]) #textual content length
data['ID']   = np.array([tweet.id for tweet in tweets])
data['Date'] = np.array([tweet.created_at for tweet in tweets])
data['Source'] = np.array([tweet.source for tweet in tweets])
data['Likes']  = np.array([tweet.favorite_count for tweet in tweets]) #likes counts
data['RTs']    = np.array([tweet.retweet_count for tweet in tweets]) #retweets count

In [None]:

def cleanTweet(tweet):
    '''
    Utility function to clean the text in a tweet by removing 
    links and special characters using regex.
    '''
    # raw string so that the regex escapes (\w, \S, \/) are not treated as string escapes
    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())

def analyseSentiment(tweet):
    '''
    Utility function to classify the polarity of a tweet
    using textblob.
    '''
    analysis = TextBlob(cleanTweet(tweet))
    if analysis.sentiment.polarity > 0:
        return 1
    elif analysis.sentiment.polarity == 0:
        return 0
    else:
        return -1

In [None]:
# Compute sentiment for each tweet and add the result into a new column
data['Sentiment'] = np.array([ analyseSentiment(tweet) for tweet in data['Tweets'] ])

# Display the first 10 elements of the dataframe
display(data.head(10))

In [None]:
# Construct lists with classified tweets

positiveTweets = [ tweet for index, tweet in enumerate(data['Tweets']) if data['Sentiment'][index] > 0]
neutralTweets = [ tweet for index, tweet in enumerate(data['Tweets']) if data['Sentiment'][index] == 0]
negativeTweets = [ tweet for index, tweet in enumerate(data['Tweets']) if data['Sentiment'][index] < 0]

# Calculate percentages

positivePercent = len(positiveTweets)*100/len(data['Tweets'])
neutralPercent = len(neutralTweets)*100/len(data['Tweets'])
negativePercent = len(negativeTweets)*100/len(data['Tweets'])

# Print percentages

print("Percentage of positive tweets: {}%".format(positivePercent))
print("Percentage of neutral tweets: {}%".format(neutralPercent))
print("Percentage of negative tweets: {}%".format(negativePercent))

In [None]:

%matplotlib inline

labels = ['Positive', 'Neutral', 'Negative']
sizes = [positivePercent, neutralPercent, negativePercent]

# Set different colors
colors = ['green', 'grey', 'red']

plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
plt.axis('equal')
plt.show()

The pie chart above shows the result. Most people show a positive attitude towards surfing in Australia, and fewer than 10% show a negative attitude. This suggests that the surfing business has a large market in Australia and that people love the sport.
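
The cleaning and classification steps above can be sketched without the Twitter API or TextBlob. This is only a minimal illustration, not the notebook's pipeline: `clean_text`, `polarity_label` and `sentiment_shares` are hypothetical helper names, and the polarity scores are made-up values standing in for TextBlob output.

```python
import re
from collections import Counter

# Same pattern as cleanTweet: drop @mentions, non-alphanumeric characters and URLs.
_PATTERN = re.compile(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)")

def clean_text(text):
    """Replace every match with a space, then collapse repeated whitespace."""
    return " ".join(_PATTERN.sub(" ", text).split())

def polarity_label(polarity):
    """Map a polarity score to 1 (positive), 0 (neutral) or -1 (negative)."""
    if polarity > 0:
        return 1
    if polarity == 0:
        return 0
    return -1

def sentiment_shares(polarities):
    """Return the percentage of positive, neutral and negative scores."""
    counts = Counter(polarity_label(p) for p in polarities)
    total = len(polarities)  # assumed to be non-zero
    return {name: counts[key] * 100 / total
            for name, key in (("positive", 1), ("neutral", 0), ("negative", -1))}

# Hypothetical polarity scores standing in for TextBlob output.
sample_scores = [0.5, 0.2, 0.0, -0.3, 0.8]
print(clean_text("@SurfAus Great waves today!!! https://t.co/xyz #surfing"))
print(sentiment_shares(sample_scores))
```

Note that the mention pattern stops at characters such as underscores, so a handle like `@surf_aus` would leave the fragment `aus` behind in the cleaned text.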

### 2.2 Figure out the market
However, our business focuses mainly on the north and south coasts (the Sunshine Coast and the Gold Coast) in Queensland, so we need to consider whether the population of the cities near these two coasts can support it. So, let's do it.

In [None]:
#!pip install bs4
#!pip install html5lib
!pip install bs4
!pip install html5lib


In [None]:
# import required libraries
# web scraping
import urllib.request
from bs4 import BeautifulSoup
import html5lib
import string

<b>We get the population HTML from Wikipedia and clean it to extract the data we want</b>

In [None]:
Population_list = [] # save the population of the different cities
# general function that receives a url, and returns an html page ready to be parsed
def get_HTML(url):
    response = urllib.request.urlopen(url)     # connect to server
    html = response.read()                     # if access is allowed
    return html                                # return html document of the given url
Australia_City_Population_HTML = get_HTML('https://en.wikipedia.org/wiki/List_of_cities_in_Australia_by_population')

soup = BeautifulSoup(Australia_City_Population_HTML, "html.parser")
span_element = soup.find(text='Greater Capital City Statistical Areas/Significant Urban Areas by population')
h2_element = span_element.parent
table_element = h2_element.findNext('table') # the first table that follows the heading

for tr_element in table_element.findAll('tr'): # find the city name and population
    City_list=[]
    i=0
    for td_element in tr_element.findAll('td'):
        
        if i==1:
            City_list.append(td_element.a.text)
        if i == 3:
            City_list.append(td_element.text.strip('\n'))
        i+=1
    if City_list != []:   
        Population_list.append(City_list)   

Population_list

After this preliminary cleaning, the data is easy to read. However, the population values are still strings, which are not ready for calculation, so we need to clean them further.
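
A slightly more defensive version of this conversion can be sketched as a small helper. This is only an illustration under assumptions: `parse_population` is a hypothetical name, the sample strings are made up, and the handling of bracketed footnote markers assumes that scraped cells may sometimes contain them.

```python
import re

def parse_population(raw):
    # Convert a scraped table cell (digits with thousands separators) to an int.
    # Drop bracketed footnote markers such as "[3]", then keep the digits only.
    without_notes = re.sub(r"\[[^\]]*\]", "", raw)
    digits = re.sub(r"[^0-9]", "", without_notes)
    if not digits:
        raise ValueError(f"no digits found in {raw!r}")
    return int(digits)

# Made-up sample cells, not values from the scraped table.
print(parse_population("2,514,184\n"))
print(parse_population(" 1,000[2]"))
```

Raising an error for cells without digits makes a changed table layout visible immediately instead of silently producing wrong totals.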

In [None]:
#convert the population data type from string to int
for city_list in Population_list: 
    city_list[1]=city_list[1].strip('\n')
    city_list[1]=city_list[1].replace(',','')
    city_list[1]=int(city_list[1])
Population_list

Now we have the population of different cities in Australia. The next step is to find the population of the cities near the Sunshine Coast and the Gold Coast. The main cities near the north and south coasts are:

                1.Brisbane
                2.Gold Coast
                3.Sunshine Coast
                4.Toowoomba
                5.Bundaberg

In [None]:
##Calculate the population of Brisbane, Gold Coast, Sunshine Coast, Toowoomba, Bundaberg
Near_cities=["Brisbane","Gold Coast","Sunshine Coast","Toowoomba","Bundaberg"]
total_population=0
y=[]
for i in Near_cities:
    for city_list in Population_list:
        if city_list[0]== i:
            print ('The population of '+ city_list[0]+ ' is ' + str(city_list[1]) )
            y.append(city_list[1])
            total_population = total_population + city_list[1]


In [None]:
#Visualize the population
#Import the plotting library
import matplotlib.pyplot as plt

#Setup the data

x = ["Brisbane","Gold Coast","Sunshine Coast","Toowoomba","Bundaberg"]
colours = ['red','green','pink','yellow','blue']
#Plot the data
plt.bar(x,y, color=colours)

#Label the chart
plt.ylabel('population')
plt.xlabel('city')
plt.title('Population of cities near Sunshine Coast and Gold Coast')

print ('The total population of Brisbane, Gold Coast, Sunshine Coast, Toowoomba, Bundaberg is ' + str(total_population) )

population_like_surfing = float(total_population)/20.0
print('The number of people who surf is about '+ str(int(population_like_surfing))+' (in Australia alone, one in every 20 people surfs)')


Not too bad!!!

We still have <b>184149</b> people who are likely to surf, so the business is worth running. We will also have many tourists from other parts of Australia and from other countries, so the total should be over <b>200000</b> every year. The visualization also shows that Brisbane has the largest population of these five cities. Although the numbers already tell us this, the chart makes the difference much more striking.
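
The estimate above is simple integer arithmetic, sketched here for clarity. The helper name `estimate_surfers` is hypothetical, and the total population is an illustrative value chosen to be consistent with the 184149 figure, not the scraped total itself.

```python
def estimate_surfers(total_population, one_in=20):
    """Estimate the number of surfers if one person in `one_in` surfs."""
    return total_population // one_in

# Illustrative total, chosen to be consistent with the 184149 figure quoted in the text.
total_population = 3_682_980
print(estimate_surfers(total_population))
```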

So we know the capacity of the market. However, 200000 is still a huge number for our business, because we have only just started and have no experience.

Let's narrow it down and find our target customers. This will make the business easier to operate, which suits a newly formed company.

### 2.3 Analysis of market trends and target customers

Firstly, let's analyse the chart below.

Between 2010 and 2014, the number of Aussie women taking part in surfing rose from 218,000 to 258,000, an increase of almost 20%. The number of teenage girls aged 14-17 who surf regularly or occasionally grew from 31,000 to 50,000 over that time, while the number of 18-24-year-old women rose from 46,000 to 59,000. The sport is also experiencing a boom among women aged 50+, 58,000 of whom hit the surf last year, up from 40,000 in 2010 – a 45% increase in participation.

66,000 Aussie men aged 18-24 surfed occasionally or regularly in 2010; that figure has since plummeted to 30,000, about half the number of women the same age who surf. Declining participation among teenage boys aged between 14 and 17 means that they too are now outnumbered by their female peers. In contrast, men aged 50+ are taking to the waves in ever-increasing numbers: up from 93,000 in 2010 to 169,000 in 2014.  

After this analysis, we know two important things. The first is that more and more women are taking up surfing. The second is that the number of men and women aged 50+ who participate in the sport is increasing dramatically.
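
The percentage changes behind these statements can be checked with a short calculation. The sketch below uses only the participation figures quoted in the text; the helper name `percent_change` and the group labels are our own.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) * 100 / old

# Participation figures quoted in the text (2010 -> 2014).
figures = {
    "women, all ages": (218_000, 258_000),
    "girls 14-17": (31_000, 50_000),
    "women 18-24": (46_000, 59_000),
    "women 50+": (40_000, 58_000),
    "men 18-24": (66_000, 30_000),
    "men 50+": (93_000, 169_000),
}

for group, (before, after) in figures.items():
    print(f"{group}: {percent_change(before, after):+.1f}%")
```

Of the growing groups, men aged 50+ show the largest relative increase, while men aged 18-24 are the only group listed here that shrinks.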

It is clear that we should focus our business on <b>customers aged 50+ and women surfers</b>.

#### Surfing participation in Australia: 2010 vs 2014

<img src='Surfing-boys-girls-chart.jpg' >

## 3. When and where to run the business?

We should ask ourselves a question before we consider when to run the business: when do our customers want to go surfing? The answer is simple: when they feel hot. Some people start to feel hot when the <b>temperature</b> reaches around 30 °C, so the next step is to find the months in which the average highest temperature is over 30 °C. The <b>water temperature</b> is also very important, because water colder than 25 °C can be uncomfortable or even harmful for people. Another important element to consider is the <b>wave height</b>.  

### 3.1 Temperature

In [None]:
#let's view the average weather condition based on month in southern Queensland
import pandas
import matplotlib.pyplot as plt
# Set variables for file and index column
file = 'Avtemp.csv' #see above
colname = 'Month' #open the csv and have a look

# Read 
avtemp = pandas.read_csv(file, index_col= colname)
print(avtemp.shape)
avtemp

In [None]:
#Get the average highest and lowest temperature in a year
month=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
avtemp_mon = avtemp.filter(month, axis=1)
avtemp_mon
avtemp_mon_maixum = avtemp_mon.loc['Highest']
avtemp_mon_minium = avtemp_mon.loc['Lowest']


In [None]:
# Take a look at the data for highest
avtemp_mon_maixum

In [None]:
#Take a look at the data for lowest
avtemp_mon_minium

In [None]:
#Let us see the temperature trend over the whole year
# Add labels and set colours
plt.plot(avtemp_mon_minium,'g-',label='Lowest')
plt.plot(avtemp_mon_maixum,'m-',label='Highest')

# Create legend.
plt.legend(loc='upper right')
plt.xlabel('Month')
plt.ylabel('Temperature')

This visualization makes the temperature trend over the whole year easy to see. The average highest temperatures in <b>November, December, January, February and March</b> are over 30 °C. From February the temperature starts to go down. June to August is winter, and July is the coldest month in Queensland. 

### 3.2 Water Temperature 

In [None]:
#Here is the sea water temperature data near the Gold Coast and the Sunshine Coast
file1 = 'Average sea temp.csv' #see above
colname = 'Sta' #open the csv and have a look

# Read 
avtemp1 = pandas.read_csv(file1, index_col= colname)
print(avtemp1.shape)
avtemp1

In [None]:
#Get the average highest and lowest temperature in a year
month=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
avtemp_mon1 = avtemp1.filter(month, axis=1)
avtemp_mon1
avtemp_mon_max1 = avtemp_mon1.loc['Max C']
avtemp_mon_min1 = avtemp_mon1.loc['Min C']
avtemp_mon_avg1 = avtemp_mon1.loc['Avg C']



In [None]:
#Let us see the water temperature trending in the whole
# Add labels and set colours
plt.plot(avtemp_mon_min1,'g-',label='Lowest')
plt.plot(avtemp_mon_max1,'m-',label='Highest')
plt.plot(avtemp_mon_avg1,'h-',label='Average')
# Create legend.
plt.legend(loc='upper right')
plt.xlabel('Month')
plt.ylabel('Temperature')

Taken together, the visualizations show that the average highest air temperatures in <b>November, December, January, February and March</b> are over 30 °C, and that the water temperature is also suitable for surfing in those months. Both the air temperature and the water temperature are stable, so we do not need to worry about rapid changes. In addition, the students' summer holiday falls within these five months, which is also the tourist season. We should therefore run our main business in these five months. Which days we operate depends on the weather and wave conditions.
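
The month selection can also be written as an explicit filter, which makes the two thresholds easy to adjust. This is a sketch under assumptions: `suitable_months` is a hypothetical helper, the thresholds are treated as inclusive, and the temperature values are placeholders for illustration only, not values from the CSV files.

```python
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

def suitable_months(air_high, water_avg, min_air=30.0, min_water=25.0):
    """Return the months, in calendar order, that meet both thresholds.

    Months missing from either mapping are skipped.
    """
    return [
        month for month in MONTHS
        if month in air_high and month in water_avg
        and air_high[month] >= min_air
        and water_avg[month] >= min_water
    ]

# Placeholder values for illustration only; they are NOT taken from the CSV files.
air_high = {'Jan': 31.0, 'Jul': 22.0, 'Nov': 30.5}
water_avg = {'Jan': 26.0, 'Jul': 21.0, 'Nov': 24.0}
print(suitable_months(air_high, water_avg))
```

With real data, the two dictionaries could be built from the monthly rows that the notebook selects with `.loc`.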

### 3.3 Wave Height

The wave height should not exceed 1.8 m, because we need to consider our customers' safety.

First, let's plot the wave height on the Gold Coast and the Sunshine Coast over the last two years to see whether any regular patterns emerge.

There are thousands of records in a year. We cannot read them one by one and draw them on paper, so we need software tools to make this possible.

In [None]:
# Load the required libraries

# Data Manipulation
import numpy as np
import pandas as pd
#from pandas.tools.plotting import autocorrelation_plot

# Data Visualization 
import matplotlib.pyplot as plt
from matplotlib import pyplot


In [None]:
# load the dataset containing wave heights for one year and inspect its structure
df = pd.read_csv( 'gold_coast_2018.csv' )
print( df.head() )

Define a function that draws the wave height for one year

In [None]:
import csv
from datetime import datetime
def drawHeight(filename):
    # Read the timestamp (column 0) and significant wave height (column 1) from the CSV
    Gold_Hs=[]
    Gold_time=[]
    with open(filename) as f:
        Greader= csv.reader(f)
        header_row=next(Greader)   # skip the header row
        for row in Greader:
            high=float(row[1])
            if high == -99.9:
                # -99.9 appears to mark a missing reading; plot it as a gap, not as a real height
                high=float('nan')
            Gold_Hs.append(high)

            time=datetime.strptime(row[0],"%m/%d/%Y %H:%M")
            Gold_time.append(time)

    # Draw the wave height trend over the whole year
    fig = plt.figure(dpi=128,figsize=(10,6))
    plt.plot(Gold_time,Gold_Hs,'g-',label='Significant wave height')

    # Create legend.
    plt.legend(loc='upper right')

    # Add labels
    plt.xlabel('Month')
    plt.ylabel('Significant Height')


#### Draw the Gold Coast wave height in 2018

In [None]:
drawHeight('gold_coast_2018.csv')

#### Draw the wave height of Gold Coast in 2017

In [None]:
drawHeight('gold_coast_2017.csv')

After comparing the graphs for the two years, we find that the wave height in December and January is stable and well suited to surfing. The wave height in other months is less stable, but there are still many days suitable for surfing during these months. 









#### Draw Sunshine Coast wave height in 2018

In [None]:
drawHeight('sunshine_coast_2018.csv')

#### Draw wave height of Sunshine Coast in 2017

In [None]:
drawHeight('sunshine_coast_2017.csv')

The graphs show that the wave height on the Sunshine Coast was not stable.

Visualizing the wave height makes it easier to look for patterns.

In conclusion, it is hard to find a clear pattern in wave height, because it depends on many factors such as tide, wind, weather and conditions in other sea areas. It is therefore very important to base the decision on a forecast built from real-time data.
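
Besides plotting, the records can be summarised numerically, for example as the share of readings per month that stay at or below the 1.8 m safety limit. The sketch below assumes the same CSV layout and date format that `drawHeight` reads (timestamp in the first column, significant wave height in the second, -99.9 for missing readings). The function name `safe_share_by_month`, the header names and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

MISSING = -99.9      # sentinel the notebook treats as a missing reading
MAX_SAFE_HS = 1.8    # safety limit from section 3.3, in metres

def safe_share_by_month(lines):
    """Return {month number: share of valid readings at or below MAX_SAFE_HS}."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    safe = defaultdict(int)
    valid = defaultdict(int)
    for row in reader:
        hs = float(row[1])
        if hs == MISSING:
            continue  # skip missing readings so they do not distort the share
        month = datetime.strptime(row[0], "%m/%d/%Y %H:%M").month
        valid[month] += 1
        if hs <= MAX_SAFE_HS:
            safe[month] += 1
    return {m: safe[m] / valid[m] for m in sorted(valid)}

# Made-up sample rows in the assumed layout: timestamp, significant wave height.
sample = io.StringIO(
    "Date/Time,Hs\n"
    "01/01/2018 00:00,1.2\n"
    "01/01/2018 00:30,-99.9\n"
    "01/01/2018 01:00,2.1\n"
    "02/01/2018 00:00,0.9\n"
)
print(safe_share_by_month(sample))
```

To run it on a real file, pass an open file object, for example `safe_share_by_month(open('gold_coast_2018.csv'))`.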

### 3.4 Get The Forecast

We use the forecast data to decide whether to run the business in the next few hours and which surfing spot is the best choice. The data comes from web scraping, and we use BeautifulSoup to parse and analyse the HTML.

In [None]:
# web scraping
import urllib.request
from bs4 import BeautifulSoup
import requests

# for regular expressions
import re

# image display
from IPython.core.display import HTML 

In [None]:
# get the forecast page for the surfing spot we want to look up
def get_HTML_forecast( place ):
    # specify a header (if this script returns a ConnectionError exception, just change the name in the header)
    headers = {'User-Agent': 'Catarina\'s_request'}

    place = place.replace(" ", "%20")
    # combine the base forecast URL with the name of the surfing spot
    link = "http://www.swellmap.com/surfing/queensland/"+place+"#tables" 

    # connect to the server. A status code other than 200 means the page could not be retrieved
    response = requests.get( link, headers = headers )
    if response.status_code != 200:
        raise ConnectionError

    # create a parse tree that can be used to extract content from the HTML document
    soup = BeautifulSoup( response.content, 'html.parser')
    return soup



<b>Choose a surfing spot from the following list</b>

Gold Coast:

           1. south-stradbroke-island          
           2. the-spit
           3. narrowneck
           4. burleigh-heads
           5. currumbin-point
           6. greenmount
           
Sunshine Coast:

           1. coolum-beach          
           2. pin-cushion-(maroochydore)
           3. kawana
           4. happys-(caloundra)
           5. alexandria-bay-(noosa)
           6. boiling-pot
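
The list above can be kept in code so that a mistyped spot name is caught before any request is sent. This is a sketch, not part of the original notebook: `SPOTS` and `forecast_url` are hypothetical names, and the URL pattern is the one used in `get_HTML_forecast`.

```python
from urllib.parse import quote

BASE_URL = "http://www.swellmap.com/surfing/queensland/"

# Spot names copied from the list in this section.
SPOTS = {
    "Gold Coast": [
        "south-stradbroke-island", "the-spit", "narrowneck",
        "burleigh-heads", "currumbin-point", "greenmount",
    ],
    "Sunshine Coast": [
        "coolum-beach", "pin-cushion-(maroochydore)", "kawana",
        "happys-(caloundra)", "alexandria-bay-(noosa)", "boiling-pot",
    ],
}

def forecast_url(spot):
    """Build the forecast page URL for a listed spot, rejecting unknown names."""
    known = {name for names in SPOTS.values() for name in names}
    if spot not in known:
        raise ValueError(f"unknown surfing spot: {spot!r}")
    # Keep parentheses as they appear in the spot names; encode anything else unusual.
    return BASE_URL + quote(spot, safe="()") + "#tables"

print(forecast_url("coolum-beach"))
```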

In [None]:
SurfingSpot="coolum-beach"  #input the surfing spot
forecast_doc = get_HTML_forecast(SurfingSpot) #get the forecast HTML for the chosen surfing spot

In [None]:
def get_forecast_table( soup, indx ):  #get the forecast table
    soup_ele = soup.findAll('table', {'class' : 'table table-striped table-condensed table-center' } )[indx]
    return soup_ele

forecast_elem= get_forecast_table(forecast_doc, 0) # Get the forecast table html

Visualize the forecast data

In [None]:

# Load the required libraries

# Data Manipulation
import numpy as np

#from pandas.tools.plotting import autocorrelation_plot

# Data Visualization 
import matplotlib.pyplot as plt
from matplotlib import pyplot

        

In [None]:
#Get the time from the forecast table
def get_time( forecast_elem  ):
    table_bodys_time=[]  
    table_bodys = forecast_elem.findAll("td",{"class" : "time"})
    for td in table_bodys:
        table_bodys_time.append(td.text)
    return table_bodys_time

     

In [None]:
#Get the rating information
def get_rating( forecast_elem  ):  
    table_bodys_rating=[] 
    table_bodys = forecast_elem.findAll("td",{"class" : "rating"})
    for td in table_bodys:
        table_bodys_rating.append(int(td.text))
    return table_bodys_rating


In [None]:
#Get the weather information
def get_summary( forecast_elem  ):
    table_bodys_summary=[]  
    table_bodys = forecast_elem.findAll("td",{"class" : "summary"})
    for td in table_bodys:
        table_bodys_summary.append(td.img['title'])
    return table_bodys_summary 


In [None]:
#Get the swell information
def get_swell( forecast_elem  ):
    
    table_bodys_swell=[]
       
    table_bodys = forecast_elem.findAll("td",{"class" : "hs_sw"})
    for td in table_bodys:
        table_bodys_swell.append(float(td.text))
    return table_bodys_swell


In [None]:
#Get the setface height from the forecast
def get_setface( forecast_elem  ):

    table_bodys_Setface=[]
    table_bodys = forecast_elem.findAll("td",{"class" : "wface"})
    for td in table_bodys:
        table_bodys_Setface.append(float(td.text))
    return table_bodys_Setface


In [None]:
#Get the wave height from the forecast
def get_wave( forecast_elem  ):

    
    table_bodys_wave=[]
    table_bodys = forecast_elem.findAll("td",{"class" : "hs"})
    for td in table_bodys:
        table_bodys_wave.append(float(td.text))
    return table_bodys_wave


In [None]:
#Visualize the height of wave and print the weather condition
def get_forecast( forecast_elem  ):
    # labels for the 12 forecast slots plotted below
    time=['4h','10h','16h','22h','28h','34h','40h','46h','52h','58h','64h','70h']
    summary=get_summary( forecast_elem  )
    wave=get_wave( forecast_elem  )
    set_face=get_setface( forecast_elem  )
    swell=get_swell( forecast_elem  )  
    
    plt.plot(time[0:12],wave[0:12],'g-',label='Wave Height')
    plt.plot(time[0:12],set_face[0:12],'m-',label='Set Face')
    plt.plot(time[0:12],swell[0:12],'b-',label='Swell')
    # Create legend.
    plt.legend(loc='upper right')    

    plt.xlabel('time')
    plt.ylabel('Height  /m')
    plt.title('Wave height of '+SurfingSpot+' in next 70 hours')
    for i in range(12):    #Print the weather condition
        print ("The weather of "+SurfingSpot +" in next "+time[i]+" is "+summary[i])
        
def draw_rating(forecast_elem):      
    time=['4h','10h','16h','22h','28h','34h','40h','46h','52h','58h','64h','70h']
    rating=get_rating( forecast_elem  )
    plt.plot(time[0:12],rating[0:12],'p-',label='Surfing Rating')
    plt.ylabel('Rating')
    plt.xlabel('time')
    
    plt.title('Surfing rating of '+SurfingSpot+' trending in next 70 hours')
get_forecast( forecast_elem  )
# If there is no green line on the graph, it has merged with the purple one, which means the two series have the same values

Now we know the wave height trend and the weather conditions from the graph. 

Draw the rating trend over the next 70 hours

In [None]:
draw_rating(forecast_elem)

A rating between <b>7 and 10</b> indicates <b>good quality</b> surf, based on a long swell period, little or no wind, and a swell height greater than a metre. If the wind is quite strong, then it will be an offshore wind.

A rating between <b>4 and 6</b> indicates <b>reasonable</b> surf conditions, although possibly affected by light-fair onshore winds, and a lower swell.

A rating between <b>1 and 3</b> suggests <b>poor conditions</b> such as strong onshore winds, a low swell period and a small swell height, all of which can make surfing more difficult.
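
The three bands can be expressed as a small lookup, which is handy when the ratings are processed in code rather than read from the graph. `rating_band` is a hypothetical helper name; the band limits are the ones described above.

```python
def rating_band(rating):
    """Map a surf rating from 1 to 10 to the band described in the text."""
    if not 1 <= rating <= 10:
        raise ValueError(f"rating out of range: {rating}")
    if rating >= 7:
        return "good"
    if rating >= 4:
        return "reasonable"
    return "poor"

for value in (2, 5, 9):
    print(value, rating_band(value))
```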


OK! Now we know the weather conditions, the wave height and the surfing rating. From the graphs above we can decide whether to run the business in the next few hours, when we should finish, and which surfing spot to choose. The next question is where our customers stay overnight.
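
The choice between spots can be sketched as a simple rule: discard spots whose wave height exceeds the 1.8 m safety limit or whose rating falls below the "reasonable" band, then take the highest rating. This is an assumed decision rule for illustration, not something the notebook implements. `choose_spot` is a hypothetical name and the forecast values are made up.

```python
MAX_SAFE_WAVE_M = 1.8   # safety limit from section 3.3
MIN_RATING = 4          # lowest rating in the "reasonable" band

def choose_spot(forecasts, max_wave=MAX_SAFE_WAVE_M, min_rating=MIN_RATING):
    """Return the name of the best spot, or None if no spot qualifies.

    forecasts maps a spot name to a (rating, wave_height_m) pair for the
    forecast slot closest to the planned session.
    """
    candidates = [
        (rating, -wave, name)
        for name, (rating, wave) in forecasts.items()
        if wave <= max_wave and rating >= min_rating
    ]
    if not candidates:
        return None
    # Highest rating wins; ties go to the smaller wave.
    best_rating, best_neg_wave, best_name = max(candidates)
    return best_name

# Made-up forecast values for illustration only.
forecasts = {
    "coolum-beach": (6, 1.2),
    "kawana": (8, 2.0),       # best rating, but above the safety limit
    "greenmount": (6, 0.9),
}
print(choose_spot(forecasts))
```

Returning `None` when nothing qualifies gives a clear signal to cancel the trip for that day rather than defaulting to an unsafe spot.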

## 4. How to choose the accommodation?

We use the Tripadvisor website to find the best accommodation. There is always a Best Seller tag on the most popular hotel in the region we search. We then analysed why that hotel became the most popular and found the following reasons:
    
        1. The price is moderate and most people can afford it
        2. The amenities and the services are good
        3. The location is good. Customers can easily reach transport and their destination. 
This is perfect for our business, because we can provide our customers with great accommodation, good service and transport within a fixed budget.

<b>Customers make the best decision for us!!!</b>

In [None]:
# Tripadvisor search result pages for the two coasts
GC_HOTEL_LINK = "https://www.tripadvisor.com.au/Hotels-g255337-Gold_Coast_Queensland-Hotels.html"
SS_HOTEL_LINK = "https://www.tripadvisor.com.au/Hotels-g1132645-Sunshine_Coast_Queensland-Hotels.html"

# shared helper: download a hotel search result page and parse it
def get_HTML_hotels( link ):
    # specify a header (if this script returns a ConnectionError exception, just change the name in the header)
    headers = {'User-Agent': 'Catarina\'s_request'}

    # connect to the server. A status code other than 200 means the page could not be retrieved
    response = requests.get( link, headers = headers )
    if response.status_code != 200:
        raise ConnectionError

    # create a parse tree that can be used to extract content from the HTML document
    soup = BeautifulSoup( response.content, 'html.parser')
    return soup

# get the HTML page that contains the hotel results of our search for the Gold Coast
def get_HTML_GCHOTEL( ):
    return get_HTML_hotels( GC_HOTEL_LINK )

# get the HTML page that contains the hotel results of our search for the Sunshine Coast
def get_HTML_SSHOTEL( ):
    return get_HTML_hotels( SS_HOTEL_LINK )

# find the 'Best Seller' tag in the search results
def get_best_seller_soup( soup ):
    best_seller_soup = soup.find(text='Best Seller')
    return best_seller_soup



Find the best-selling hotel on the Gold Coast

In [None]:
hotel_doc = get_HTML_GCHOTEL() #get the html of Gold Coast searching result
Seller_elem= get_best_seller_soup(hotel_doc) #find the best seller of the hotel
hotel_info=Seller_elem.parent.parent.parent.parent.parent
hotel_name_soup = hotel_info.findAll("div", {"class" : "listing_title"})[0]
hotel_name=hotel_name_soup.a.string  #get the hotel name
Hotel_price_soup= hotel_info.findAll("div", {"class" : "price-wrap"})[0]
hotel_price= Hotel_price_soup.div.string    #get the hotel price
print('The best seller of hotel in Gold Coast is '+hotel_name+' and the price is '+hotel_price ) #print the best seller name and price

#Please try a few times if this cell fails!!!!!

Find the best-selling hotel on the Sunshine Coast

In [None]:
hotel_doc = get_HTML_SSHOTEL() #get the html of Sunshine Coast searching result
Seller_elem= get_best_seller_soup(hotel_doc) #find the best seller of the hotel
hotel_info=Seller_elem.parent.parent.parent.parent.parent
hotel_name_soup = hotel_info.findAll("div", {"class" : "listing_title"})[0]
hotel_name=hotel_name_soup.a.string  #get the hotel name
Hotel_price_soup= hotel_info.findAll("div", {"class" : "price-wrap"})[0]
hotel_price= Hotel_price_soup.div.string    #get the hotel price
print('The best seller of hotel in Sunshine Coast is '+hotel_name+' and the price is '+hotel_price ) #print the best seller name and price

#Please try a few times if this cell fails!!!!!
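
Instead of re-running the cell by hand, the retry advice can be automated with a small helper. This is a generic sketch: `retry` and its parameters are hypothetical and not part of `requests` or BeautifulSoup.

```python
import time

def retry(func, attempts=3, delay_seconds=2.0, exceptions=(Exception,)):
    """Call func() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # give up and let the caller see the last error
            time.sleep(delay_seconds * attempt)  # wait a little longer each time
```

For example, `hotel_doc = retry(get_HTML_GCHOTEL, exceptions=(ConnectionError,))` would repeat the download a few times before giving up. If the page loads but the Best Seller tag is missing, the later parsing lines fail for a different reason, so those would need their own check.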

<b>We also want to build a partnership with the best-selling hotels, which can help us save money once our business is stable and successful.</b> 

## 5. SWOT Analysis

Finally, everything is done! It is time to start our business and make money. That is very exciting! But wait: there is still one very important thing to do. As the old saying goes, know the enemy and know yourself, and you can fight a hundred battles with no danger of defeat. If we want to survive and become the biggest player in this industry, we must have a good understanding of ourselves and our competitors. So let us do the SWOT analysis.

<img src='SWOT Analysis.png' >


<b>Strengths:</b>

   Agile and flexible: We decided to start this business only a few weeks ago. At this early stage, our business structure and working group can change to suit our customers' needs. The cost of failure is small, because our business is much smaller in scale than the big players in the market.
    
   Knowing our target audience: This gives direction to our marketing and makes our messaging more consistent, so we can build stronger relationships with customers and provide a better surfing experience. We can also design special surfing activities at lower cost.  
    
   Knowing the best surfing spot: We have forecasts of the conditions at each surfing spot, which help us choose the best spot and the best time for our customers.
   
   Knowing the best-selling hotel: The best-selling hotel is chosen by customers, so its position in the market should meet most customers' needs. It has a good location, services and amenities at a price most customers can afford, which helps us save money and deliver a good accommodation experience at the same time.
    
    
    
<b>Weaknesses:</b>

Financial crisis: As a small surfing business, we do not have a reserve budget for emergencies. An emergency could therefore be fatal if we cannot get financial support from a sponsor. 

Partnership: At the start, we have not built partnerships with relevant stakeholders. That means our costs are higher than those of competitors who have good relationships with hotels.

Lack of experience: We have no previous experience of running a business. This will be a great challenge for our organization.

Lack of coaches: Our organization has just been formed by a group of surfers. That means we do not have enough professional coaches to guide our customers and keep them safe.  



<b>Opportunities:</b>

More older surfers: We found that more and more older people are taking up surfing, and our competitors have not yet focused on this consumer group.

More foreign tourists: More and more tourists from other countries want to experience surfing at spots like these, and they are also willing to spend more money to get better service. We could design a special service for foreign tourists in the future and earn more.


<b>Threats:</b>

Global warming: Climate change makes the temperature, weather and large-scale ocean circulation very unstable. It matters greatly to us, because whether we can run our business depends on our beautiful ocean.

No sponsor: We may not be able to get enough financial support.

Competitors: Other big players want to expand their market share, and they have considerable strength and resources to draw on.

Political: If Australia has bad relationships with other countries, the number of foreign tourists will decrease. 


<b>Conclusion:</b>

Every beginning is hard, but it gets easier once the first step is taken. Our biggest problems now are finances and our lack of experience in running a business. The first priority is to find a professional who can guide us on how to run a business. Then it is time to build relationships with partners in hospitality, restaurants and surfing. We are confident that we will be successful, because we really do want to bring the happiness, health and excitement of surfing to our customers.

So what are you waiting for? Contact us now!!! Go grab a board, catch some waves and find out for yourself the physical, mental and emotional rewards that surfing can give you.