Darien Nouri DarienNouri

Hi there, I'm Darien Nouri 👋

📚 Completed my undergraduate degree at New York University in both Data Science and Computer Science

Email: nouri.darien@gmail.com | dan9232@nyu.edu

🔍 Featured Projects

1. Impact of ML Model Complexity on Pairs Trading Optimization

Objective: Investigated the relationship between ML model complexity and forecasting accuracy for stock pair spread movements, with a focus on optimizing trading performance.

Led research investigating the relationship between ML model complexity and forecasting accuracy for stock pair spread movements, with a focus on optimizing trading performance.
Engineered a rich 280-feature set, including Refinitiv API-derived technical indicators, custom-scraped news headlines with BERT-based sentiment analysis, and Bloomberg Twitter sentiment data.
Evaluated model performance across the complexity spectrum: from simple Generalized Linear Models (GLMs), through Gradient Boosting Machines (GBMs) and Random Forests (RF), to neural networks, including a vanilla LSTM (~50k parameters), a hyperparameter-tuned LSTM (2M+ parameters), and a bidirectional LSTM with dropout.
Developed a backtesting framework incorporating Bayesian-adjusted mean reversion strategies, achieving 80.75% annualized returns with the tuned LSTM model, an average 105% improvement over non-deep-learning models.
GitHub Repository
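The backtesting bullet above refers to mean-reversion strategies on a pair spread. As a rough illustration only, the sketch below shows a plain rolling z-score entry rule. It is a simplified stand-in, not the Bayesian-adjusted strategy from the project, and the function name `zscore_signals`, the window length, and the entry threshold are all assumptions.

```python
from statistics import mean, stdev


def zscore_signals(spread, window=5, entry=1.0):
    """Return one signal per step: -1 (short spread), +1 (long spread), 0 (flat).

    Hypothetical helper: the z-score of each point is computed against the
    preceding `window` observations only, so no future data leaks in.
    """
    signals = []
    for i in range(len(spread)):
        if i < window:
            signals.append(0)  # not enough history yet
            continue
        history = spread[i - window:i]
        sd = stdev(history)
        if sd == 0:
            signals.append(0)  # flat history, z-score undefined
            continue
        z = (spread[i] - mean(history)) / sd
        if z > entry:
            signals.append(-1)  # spread unusually wide: bet on it narrowing
        elif z < -entry:
            signals.append(1)   # spread unusually narrow: bet on it widening
        else:
            signals.append(0)
    return signals


if __name__ == "__main__":
    toy_spread = [0, 1, 0, 1, 0, 5, 0.5, -5]  # made-up numbers
    print(zscore_signals(toy_spread))
```

In a fuller backtest, the forecasting model's predicted spread would replace or supplement the raw z-score when deciding entries.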

2. Urban Dynamics and Real Estate Markets: Enhancing Market Forecasts with Non-Traditional Data

Objective: Explored the predictive power of non-traditional urban data sources in forecasting real estate trends and Real Estate Investment Trust (REIT) performance.

Led research on the efficacy of non-traditional urban data in predicting real estate trends and REIT performance, analyzing datasets such as Citibike usage, building complaints, business operations, health inspections, and evictions.
Architected a scalable, cloud-based ecosystem to process and analyze terabytes of diverse datasets, leveraging technologies like MongoDB, AWS services, and distributed computing frameworks such as Dask and PyArrow for efficient data handling.
Engineered high-performance data acquisition systems, including an MLS web scraper that processes over 1 million listings daily and a parallel ingestion framework for more than 1 billion historical Citibike rides.
Integrated Granger-causality-optimized alternative data sources, improving market index forecasts by 25% and REIT predictions by 34.2%.
Designed an interactive Streamlit application for non-technical stakeholders, allowing intuitive exploration of correlations between alternative data and market trends. (Feature Explorer Demo Render)
GitHub Repository | IEEE Report (Pending Submission)
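The idea of choosing how far an alternative data series should lead a market series can be sketched with a lagged correlation scan. This is a deliberately simplified stand-in, not a Granger-causality test and not the project's pipeline; the function names `pearson` and `best_lag` and the toy numbers are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_lag(feature, target, max_lag=3):
    """Return (lag, scores), where lag maximizes |corr(feature[t - lag], target[t])|."""
    scores = {}
    for lag in range(1, max_lag + 1):
        # Align feature at time t - lag with target at time t.
        scores[lag] = pearson(feature[:-lag], target[lag:])
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores


if __name__ == "__main__":
    rides = [1, 3, 2, 5, 4, 7, 6, 9]   # toy "alternative data" series
    index = [0, 0] + rides[:-2]        # toy target that trails rides by 2 steps
    lag, scores = best_lag(rides, index)
    print(lag, scores)
```

A proper Granger test would instead compare an autoregressive model of the target with and without lagged feature terms.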

3. Real Estate Valuator & Market Analysis Frameworks

Objective: Developed a multi-model machine learning framework to value residential properties and forecast sale probabilities, integrating data from diverse sources.

Engineered a multi-model ML framework utilizing ensemble and gradient boosting methods for residential property valuation and sale probability forecasting, integrating data from Zillow, Yelp, and other relevant sources.
Developed an interactive Streamlit application that allows users to access real-time property valuations, explore detailed market data, and visualize macroeconomic trends in the real estate sector.
Valuation App Demo | Market Analysis App Demo | Analysis GitHub Repository

4. Web App Deployment Service

Objective: Automated the deployment of Python-based web applications on AWS EC2 instances, streamlining continuous integration and delivery processes.

Developed a scalable and automated deployment system for Python-based web applications (Dash, Streamlit) on AWS EC2 instances, enhancing the efficiency of continuous integration and delivery (CI/CD).
Integrated a GitHub webhook-based deployment system that automatically triggers deployments upon code pushes, ensuring seamless and streamlined deployment of MVPs and new features.
Leveraged PM2 for continuous process management and Nginx as a reverse proxy, providing high availability and load balancing for deployed applications.
Generated Nginx configuration dynamically from the set of deployed applications, enabling per-application management of custom domains and routing rules.
GitHub Repository
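Generating Nginx configuration from the list of deployed applications could look roughly like the sketch below. It assumes each app is a local process (for example, one managed by PM2) listening on its own port and mapped to one domain. The template, the function name `render_nginx_config`, and the example domains and ports are hypothetical and not taken from the repository.

```python
from string import Template

# "$$" is Template's escape for a literal "$", so Nginx variables such as
# $host survive substitution untouched.
SERVER_BLOCK = Template("""\
server {
    listen 80;
    server_name $domain;

    location / {
        proxy_pass http://127.0.0.1:$port;
        proxy_http_version 1.1;
        proxy_set_header Host $$host;
        proxy_set_header Upgrade $$http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
""")


def render_nginx_config(apps):
    """Render one reverse-proxy server block per app.

    `apps` maps a domain name to the local port its process listens on.
    Domains are sorted so the generated file is stable between runs.
    """
    blocks = [
        SERVER_BLOCK.substitute(domain=domain, port=port)
        for domain, port in sorted(apps.items())
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical apps: a Streamlit app and a Dash app.
    print(render_nginx_config({
        "valuator.example.com": 8501,
        "dashboard.example.com": 8050,
    }))
```

The `Upgrade` and `Connection` headers are included because Streamlit apps communicate over WebSockets; a deployment step would typically write this output to a file and reload Nginx afterwards.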

5. Embedding-Based Clustering of Political News Headlines

Objective: Applied NLP techniques to classify the political orientation of news headlines using advanced embedding and clustering methods.

Implemented text preprocessing, feature extraction, and clustering algorithms, including Spectral Clustering and K-Means, to classify the political orientation of news headlines.
Utilized BERT models for text embedding, providing a rich representation of the textual data for clustering purposes.
Explored dimensionality reduction techniques like PCA and t-SNE for effective visualization of high-dimensional data, enhancing the interpretability of clustering results.
GitHub Repository
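To make the clustering step concrete, here is a minimal K-Means (Lloyd's algorithm) sketch in plain Python. It assumes toy 2-D points in place of real BERT embeddings and uses the first `k` points as initial centroids for determinism; the project itself may well rely on a library implementation, and Spectral Clustering is not shown.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Cluster `points` into `k` groups; return (labels, centroids).

    Initial centroids are simply the first k points (an assumption made to
    keep the example deterministic; real code would use k-means++ or similar).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy stand-ins for headline embeddings: two well-separated groups.
    toy = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    labels, centroids = kmeans(toy, k=2)
    print(labels, centroids)
```

With real embeddings the vectors would have hundreds of dimensions, which is where PCA or t-SNE becomes useful for plotting the resulting clusters.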

6. Web Scrapers

Objective: Developed high-performance web scrapers for automating data extraction from multiple online platforms, supporting several projects and research initiatives.

Zillow Scraper: Built a high-performance web scraper using Azure, capable of extracting over 2 million property listings per scrape on a weekly automated schedule. GitHub Repository
LinkedIn Scraper: Created a scraper for LinkedIn to extract data from both company and user profiles, gathering information such as company posts, user experiences, and education. GitHub Repository
Yelp Scraper: Developed a Selenium-based web scraper to extract restaurant data, including information on health ratings, review counts, and business details. GitHub Repository
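The extraction step common to scrapers like these can be sketched with the standard library's `html.parser`. The markup below (`div.listing`, `span.address`, `span.price`) is entirely made up for illustration and does not reflect the structure of Zillow, LinkedIn, or Yelp pages; the actual scrapers use other tooling such as Selenium.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect address/price pairs from hypothetical listing markup."""

    FIELDS = ("address", "price")

    def __init__(self):
        super().__init__()
        self.listings = []   # one dict per <div class="listing">
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "listing":
            self.listings.append({})
        elif tag == "span" and css_class in self.FIELDS and self.listings:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.listings[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


if __name__ == "__main__":
    page = (
        '<div class="listing"><span class="address">1 Main St</span>'
        '<span class="price">$500,000</span></div>'
        '<div class="listing"><span class="address">2 Oak Ave</span>'
        '<span class="price">$725,000</span></div>'
    )
    parser = ListingParser()
    parser.feed(page)
    print(parser.listings)
```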
