# What is a hackathon?

- A blend of the words "hack" (in the sense of exploratory programming, not breaching computer security) and "marathon."
- An event, typically lasting several days, where people come together to solve problems.

### **Core Components of a Hackathon:**

1. **Theme/Challenge**: In our case, "leveraging LLMs to detect early signs of humanitarian crises from a text corpus."

2. **Collaboration**: Participants typically form teams, encouraging collaboration and pooling of diverse skill sets.

3. **Time Limit**: A handful of days.

4. **Innovation**: Teams often start from scratch and aim to end with a working prototype or solution.

5. **Presentations**: At the end, teams present their solutions to judges or peers, showcasing their work.

6. **Prizes and Recognition**: Learning, collaboration, and the satisfaction of building something new.

### **What Hackathons Are NOT**:

1. An event to "hack" into systems in an illegal way. The term "hack" here is about building and creating, not breaking into systems.

2. Always about coding. While coding is a big part, brainstorming, designing, and planning are equally important.

### **Why Do Other People Participate in a Hackathon?**

1. **Learning**: It's a fantastic opportunity to learn new technologies, tools, or methods.

2. **Networking**: Meet people with similar interests or complementary skills.

3. **Building**: Create something new and innovative, even if it's a prototype.

4. **Problem-Solving**: Real-world issues are presented, and you get to brainstorm solutions.

5. **Fun**: Despite the intensity, it's a lot of fun to work on a project in a limited timeframe and see where you get!

# Define and solve your own problem statement through research and collaboration.

### 1. **Clarify the Objective - Early Signs of Humanitarian Crisis**:


1. **Natural Disasters**:
   - **Signs**: Increased chatter or breaking news about severe weather events like hurricanes, tsunamis, earthquakes, or floods. This could include first-person accounts of events unfolding, meteorological reports, or videos/images showing damage.
   - **Example**: Tweets from affected areas about tremors might indicate an earthquake before official news outlets pick it up.


2. **Conflict and Civil Unrest**:
   - **Signs**: Discussions or news about escalating tensions in a region, protests turning violent, or the movement of military troops. Stories or firsthand accounts of violence, displacements, or people fleeing their homes.
   - **Example**: A surge in posts from a city's residents about sudden protests, or images/videos of marches and confrontations with authorities.


3. **Epidemics/Pandemics**:
   - **Signs**: An uptick in mentions about a particular disease, rising numbers of sick individuals, or announcements from health organizations. Discussions about symptoms, hospital overcrowding, or unavailability of medicines.
   - **Example**: A cluster of posts from a region discussing a mysterious illness or flu-like symptoms before official health advisories are released.

4. **Food and Water Crises**:
   - **Signs**: Stories about droughts affecting crop yields, rising prices of essential food items, or contamination of water sources. Personal accounts of hunger, malnutrition, or scarcity of resources.
   - **Example**: Multiple articles or blog posts about failed monsoons and their impact on the year's crop, leading to potential food shortages.

5. **Refugee Movements and Displacements**:
   - **Signs**: Discussions about large groups of people moving away from conflict zones, overcrowded refugee camps, or borders being closed to refugees. Personal stories of leaving homes, struggles in refugee camps, or seeking asylum.
   - **Example**: Posts on community forums from individuals seeking information on safe routes, or pictures on social platforms showcasing the influx of refugees in certain towns or camps.


Features you could consider engineering include:

- a rapid increase in frequency (e.g. the change in the number of topic-relevant posts within the last hour);
- the intensity of related posts (e.g. a negative/positive sentiment score);
- the viral nature of certain stories (e.g. the number of retweets or shares).

Any of these can be a key indicator.
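As a rough illustration of the frequency and virality features, here is a minimal sketch using only the Python standard library. The sample posts, the keyword set, and the function names (`hourly_counts`, `frequency_delta`, `total_shares`) are all hypothetical; a sentiment score is left out because it would need a lexicon or model of your choosing.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical sample posts: (timestamp, text, share_count).
# Real data would come from whichever API or feed your team uses.
posts = [
    (datetime(2024, 1, 1, 21, 10), "felt a small tremor just now", 2),
    (datetime(2024, 1, 1, 22, 5), "strong earthquake, buildings shaking", 40),
    (datetime(2024, 1, 1, 22, 20), "earthquake!! everyone is outside", 15),
    (datetime(2024, 1, 1, 22, 45), "huge tremor, power is out", 8),
]
KEYWORDS = {"earthquake", "tremor"}  # assumed topic keywords


def mentions_topic(text, keywords):
    """True if any keyword appears in the lowercased text."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def hourly_counts(posts, keywords):
    """Count topic-relevant posts per hour bucket."""
    counts = Counter()
    for ts, text, _shares in posts:
        if mentions_topic(text, keywords):
            counts[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


def frequency_delta(counts, hour):
    """Change in relevant posts between `hour` and the hour before it."""
    return counts[hour] - counts[hour - timedelta(hours=1)]


def total_shares(posts, keywords):
    """Sum of share counts over topic-relevant posts (a crude virality proxy)."""
    return sum(s for _ts, text, s in posts if mentions_topic(text, keywords))


counts = hourly_counts(posts, KEYWORDS)
delta = frequency_delta(counts, datetime(2024, 1, 1, 22))
shares = total_shares(posts, KEYWORDS)
print(delta, shares)
```

Substring matching is deliberately crude here; a real pipeline would likely tokenize the text first and choose the time window to suit the data source.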
    
**In the meantime,**
_Leveraging LLMs can help in identifying, categorizing, and escalating these signs for timely humanitarian responses._

### 2. **Let's break down the steps**:
Each step is broken down in more detail below:

- **Step 0: Scoping**: 
  - Identify the humanitarian signs/topics of interest: natural disasters, war zones, refugee movements, etc.
  - Research where these topics are most frequently discussed: Twitter, news article feeds, Reddit, etc.

- **Step 1: Download/Preprocess Dataset**:
  - Consider what a "good" dataset might look like: size, quality, sources.
      - Structured data is king, but unstructured data is usually what you get; this is where LLMs come to the rescue.
  - Use the API examples, RSS feed parsing, etc. that we're providing.
  
- **Step 2: Analyze Dataset**:
  - Easy analysis: word occurrence/frequencies, sentiment analysis, etc. (_feature generation_)
  - Perform exploratory data analysis (EDA) and initial data analysis (IDA).
  - Preprocess the data for LLMs:
      - feature generation
      - tokenization, handling of different languages, etc.

- **Step 3: Fit Machine Learning Models**:
  - From linear regression to gradient boosting regression/classification: pick your poison.
  - Fit one model first, then go to town with multiple models via AutoML solutions (e.g. PyCaret);
      - Read up on evaluation metrics and decide which one you will use to select the best model.


- **Step 4: Final Presentation**:
  - Present in whatever format fits your solution best: code, video, presentation, report, etc.
  - Discuss how you worked through the research process:
      - how you made the scoping decision based on existing literature;
      - which dataset you downloaded and processed, how, and why it is relevant;
      - which analytic techniques you applied to the preprocessed data;
      - how you arrived at the format of your final presentation;
      - how each major contributed to the overall delivery of the final project.
      
        **Use these same criteria to evaluate peer projects.**
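For Step 1, RSS feed parsing can be sketched with the standard library alone. This is an illustrative example, not the material provided for the hackathon: the feed text in `SAMPLE_RSS` is made up, and `parse_rss_items` is a hypothetical helper name. A real feed would first be downloaded from its URL.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 snippet standing in for a downloaded feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Flood warnings issued along the river</title>
      <link>https://example.com/flood</link>
      <description>Heavy rain is expected overnight.</description>
    </item>
    <item>
      <title>Local market reopens</title>
      <link>https://example.com/market</link>
      <description>Vendors return after renovations.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Turn each <item> in an RSS document into a plain dict."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("item"):
        records.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return records


records = parse_rss_items(SAMPLE_RSS)
for record in records:
    print(record["title"], "->", record["link"])
```

Turning each item into a flat dict gives you the structured rows that the later analysis steps expect, even though the source itself is semi-structured text.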

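On choosing an evaluation metric in Step 3: crisis-related posts are usually rare, so accuracy alone can mislead. The sketch below uses made-up labels and hypothetical helper names (`accuracy`, `precision_recall_f1`) to show two sets of predictions with the same accuracy but very different precision, recall, and F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: 1 = crisis-related post, 0 = unrelated (2 of 10 positive).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_negative = [0] * 10                      # never flags anything
model_b = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]        # one hit, one miss, one false alarm

for name, y_pred in [("always_negative", always_negative), ("model_b", model_b)]:
    print(name, accuracy(y_true, y_pred), precision_recall_f1(y_true, y_pred))
```

Both predictors score 0.8 on accuracy, yet the first never detects a crisis. Which metric matters most depends on whether missed crises or false alarms are costlier for your problem statement.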
### 3. **Simplified Examples**:
A worked example of scraping a subreddit and targeting a particular humanitarian issue (bullying) mentioned across the scraped posts, using simple word stats.
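To give a feel for what "simple word stats" could mean, here is a minimal sketch. It is not the worked example itself: the posts are invented, and the keyword set and the `keyword_stats` helper are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Invented posts standing in for scraped subreddit text.
posts = [
    "My classmates keep bullying me online and I feel unsafe.",
    "Any tips for the math exam next week?",
    "He was bullied at school again today. The harassment never stops.",
]
KEYWORDS = {"bullying", "bullied", "harassment"}  # assumed topic vocabulary


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_stats(posts, keywords):
    """Return per-keyword counts and the share of posts with at least one hit."""
    term_counts = Counter()
    posts_with_hit = 0
    for post in posts:
        hits = [tok for tok in tokenize(post) if tok in keywords]
        term_counts.update(hits)
        if hits:
            posts_with_hit += 1
    share = posts_with_hit / len(posts) if posts else 0.0
    return term_counts, share


term_counts, share = keyword_stats(posts, KEYWORDS)
print(term_counts.most_common(), round(share, 2))
```

Exact keyword matching misses spelling variants and context, which is one place where an LLM-based classifier could complement these counts.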


### 4. **Daily Check-ins**:
- Milestones:
    0. D1 EOD: Group formed;
    1. D2 EOD: Data Download;
    2. D3 EOD: Preprocessing completed, stats/feature extraction completed;
    3. D4 EOD: Models fitted and initial results;
    4. D5 EOD: Final deliverable consolidation and potential iterations;
- EOD check-ins on current status: major blockers reviewed with TAs as a summary.

### 5. **Resources**:
- Tutorial on web-scraping
- Worked example of leveraging one subreddit for simplified humanitarian crisis detection
- OpenAI API (Available upon request)

### 6. **Role Dynamics**:
Are you the researcher, data collector, analyst, presenter, project manager, software developer, etc.?

### 7. **Communicate, communicate, communicate**:
- Use Slack to reach out to us if you have any questions.
- Don't wait until a question has become urgent.

### 8. **Learn from your peers**:
- Talk to your teammates.
- You will learn more about other groups' scopes and prospects today. Talk to them where applicable.
- You will have a peer feedback session Wednesday for mid-week review.


- Hackathons can be intense and sometimes frustrating. 
- It's a learning process, and the journey is as important as the final product. 
- In the end, it's about recognizing effort and celebrating small wins.
- This is **open-ended research**.