# Multimodal Disaster Analysis using Social Media
### Ronak Sharma (210108041)
*DA 623 – Computing with Signals*  
*Indian Institute of Technology Guwahati*

## Motivation
In recent years, the frequency and intensity of natural disasters such as floods, earthquakes, and hurricanes have increased. Effective disaster management depends on timely and accurate information, but traditional systems based on satellite images and ground sensors suffer from latency and limited coverage. Because social media is used worldwide, it can serve as a rich, real-time source of multimodal data (text, images, videos, location). This project explores how such multimodal data can enhance disaster response systems.

## Connection to Past and Current Work in Multimodal Learning
Multimodal learning integrates information from different types of data: text, images, audio, location, and video. Earlier work often focused on unimodal pipelines (e.g., only text or only images), which can delay decision-making. Advances such as Word2Vec, CNNs, and YOLO, as well as topic modeling techniques like LDA, have allowed systems to analyze and interpret multiple data sources in parallel.

Multimodal approaches are now widely used in domains such as autonomous driving and medical diagnosis, and increasingly in emergency response, supported by datasets such as CrisisMMD and those released through MediaEval.

## Learnings from This Work
- Multimodal integration offers faster, context-rich insights.
- Text classification built on representations such as Bag-of-Words and Word2Vec enhances situational understanding.
- Object detection using CNNs helps in analyzing infrastructure damage from images.
- Topic modeling (e.g., LDA) can determine the disaster stage (pre-, during, or post-disaster).
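
To make the text-classification point concrete, here is a minimal sketch of a Bag-of-Words classifier, written as a multinomial Naive Bayes model in plain Python. The training tweets, the two labels (`damage`, `rescue`), and the helper names are all hypothetical and chosen only for illustration; this is not the pipeline used in the project.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled tweets; the categories are illustrative only.
TRAIN = [
    ("earthquake destroyed my house need help", "damage"),
    ("bridge collapsed due to heavy rain", "damage"),
    ("rescue teams are helping victims", "rescue"),
    ("volunteers are distributing food and water", "rescue"),
]

def bag_of_words(text):
    """Map a text to word counts (word order is discarded)."""
    return Counter(text.lower().split())

def train_naive_bayes(examples):
    """Collect per-class word counts, class frequencies, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        word_counts[label].update(bag_of_words(text))
        class_counts[label] += 1
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def predict(text, word_counts, class_counts, vocab):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word, n in bag_of_words(text).items():
            if word in vocab:  # ignore words never seen in training
                p = (word_counts[label][word] + 1) / (total_words + len(vocab))
                score += n * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAIN)
print(predict("house collapsed after the earthquake", *model))
print(predict("teams helping with food", *model))
```

In practice, scikit-learn's `CountVectorizer` and `MultinomialNB` provide the same building blocks and would be a more realistic choice on a real dataset such as CrisisMMD.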

## Code Snippets and Demonstration
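Here’s a small demo of text preprocessing and vectorization using Word2Vec for disaster tweets. With only four sentences the learned vectors carry little meaning; the snippet only illustrates the workflow.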

```python
from gensim.models import Word2Vec
from nltk.tokenize import word_tokenize
import nltk

# Tokenizer data: older NLTK releases use 'punkt', newer ones look for 'punkt_tab'
nltk.download('punkt')
nltk.download('punkt_tab')

# Example tweets
tweets = [
    "Earthquake destroyed my house. Need help!",
    "Floods in Mumbai again. Streets are underwater.",
    "Rescue teams are helping victims in Assam.",
    "Bridge collapsed due to heavy rain in Kerala."
]

# Tokenization (lowercased so 'Earthquake' and 'earthquake' share one vector)
tokenized = [word_tokenize(tweet.lower()) for tweet in tweets]

# Train Word2Vec model; min_count=1 keeps every word because the corpus is tiny
model = Word2Vec(sentences=tokenized, vector_size=50, window=2, min_count=1, workers=2)

# Vector for a sample word (a 50-dimensional NumPy array)
print(model.wv['earthquake'])
```
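
Word2Vec yields one vector per word, while classification or retrieval usually needs one vector per tweet. A common, simple option is to average the word vectors of a tweet and compare tweets by cosine similarity. The sketch below assumes that approach; `toy_wv` is a made-up 3-dimensional stand-in for `model.wv`, and `tweet_vector` and `cosine_similarity` are hypothetical helper names.

```python
import numpy as np

# Toy 3-dimensional embeddings standing in for `model.wv`; the values are made up.
toy_wv = {
    "flood": np.array([1.0, 0.0, 0.0]),
    "water": np.array([0.8, 0.2, 0.0]),
    "rescue": np.array([0.0, 1.0, 0.0]),
    "teams": np.array([0.0, 0.8, 0.2]),
}

def tweet_vector(tokens, wv, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [wv[t] for t in tokens if t in wv]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

v_flood = tweet_vector(["flood", "water", "everywhere"], toy_wv, dim=3)
v_flood2 = tweet_vector(["water"], toy_wv, dim=3)
v_rescue = tweet_vector(["rescue", "teams"], toy_wv, dim=3)

print(cosine_similarity(v_flood, v_flood2))  # close to 1: similar content
print(cosine_similarity(v_flood, v_rescue))  # much lower: different content
```

Because the helper only relies on `token in wv` and `wv[token]`, the trained `model.wv` from the demo could be passed in place of `toy_wv` with `dim=50`.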

### Image Analysis (Conceptual)
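For image data, pre-trained CNNs (e.g., VGG, ResNet) or YOLO can serve as a starting point for identifying damaged structures or crowded zones during disasters. Note that an ImageNet-pretrained ResNet predicts generic ImageNet categories, so recognizing damage would require fine-tuning on labeled disaster images. Example pipeline: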

```python
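# Illustrative sketch: 'disaster_scene.jpg' is a placeholder path, not a bundled file
import torch
from PIL import Image
from torchvision import models, transforms

# Load ImageNet-pretrained weights (`weights=` replaces the deprecated `pretrained=True`)
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()

# Assume we have a disaster image; convert to RGB in case it is grayscale or has alpha
img = Image.open('disaster_scene.jpg').convert('RGB')
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    # ImageNet channel statistics expected by the pretrained weights
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img_t = transform(img).unsqueeze(0)  # add a batch dimension: (1, 3, 224, 224)

# Inference only, so gradients are not needed
with torch.no_grad():
    output = model(img_t)  # logits over the 1000 ImageNet classes, shape (1, 1000)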
```
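
The text and image pipelines above each produce per-class scores, and a multimodal system has to combine them somehow. This report does not specify a fusion strategy, so the following is only a sketch of one common option, late fusion, where each modality's scores are turned into probabilities and then averaged with a weight. The label set, the logits, and the `text_weight` parameter are all hypothetical.

```python
import numpy as np

# Hypothetical shared label set; not taken from any specific dataset.
CLASSES = ["no_damage", "mild_damage", "severe_damage"]

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def late_fusion(text_logits, image_logits, text_weight=0.5):
    """Weighted average of per-modality class probabilities."""
    p_text = softmax(np.asarray(text_logits, dtype=float))
    p_image = softmax(np.asarray(image_logits, dtype=float))
    return text_weight * p_text + (1.0 - text_weight) * p_image

# Made-up scores: the text is ambiguous, the image points to severe damage.
text_logits = [1.0, 1.2, 1.1]
image_logits = [0.1, 0.5, 3.0]

fused = late_fusion(text_logits, image_logits, text_weight=0.4)
for name, p in zip(CLASSES, fused):
    print(f"{name}: {p:.3f}")
print("Predicted:", CLASSES[int(np.argmax(fused))])
```

Late fusion keeps the two models independent, so either one can be retrained or replaced without touching the other; the weight would normally be tuned on validation data rather than fixed by hand.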

## Reflections
**What surprised me?**
- The diversity and richness of social media data. One post can have location, timestamp, image, and meaningful text.
- Combining simple NLP models and basic CNNs still yields informative outputs.

**Scope for improvement:**
- Incorporating real-time data pipelines.
- Adding video summarization and crowd-sourced validation.
- Integrating with IoT and drone-based surveillance.

## References
- Mikolov et al., 2013: *Efficient Estimation of Word Representations in Vector Space* (Word2Vec)
- Redmon et al., 2016: *You Only Look Once: Unified, Real-Time Object Detection* (YOLO)
- Blei et al., 2003: *Latent Dirichlet Allocation*
- CrisisMMD Dataset
- Gensim, Torchvision, NLTK