A proof-of-concept M.Tech project for predicting Sensex stock trends using NLP and SVM, built in C# with Visual Studio 2010.
- Purpose: Analyzes news sentiment and technical indicators (EMA) to predict Sensex stock movements.
- Tech: C# (.NET 4.8), OpenNLP (NLP), LibSVM (SVM), SQL Server (data storage), HtmlAgilityPack (web scraping), Twitterizer (Twitter integration).
- Status: Academic prototype; not suitable for real-money investing.
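For readers unfamiliar with the EMA indicator mentioned above, here is a minimal sketch of the standard exponential moving average. It is not taken from the project source: the class name `Ema`, the method `Compute`, and the choice to seed the series with the first closing price are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class Ema
{
    // Standard EMA recurrence: ema[i] = alpha * close[i] + (1 - alpha) * ema[i - 1],
    // with alpha = 2 / (period + 1). Seeding with the first close is an assumption;
    // some implementations seed with a simple moving average instead.
    public static List<double> Compute(IList<double> closes, int period)
    {
        if (closes == null) throw new ArgumentNullException("closes");
        if (period < 1) throw new ArgumentOutOfRangeException("period");

        var result = new List<double>(closes.Count);
        if (closes.Count == 0) return result;

        double alpha = 2.0 / (period + 1);
        double ema = closes[0];
        result.Add(ema);

        for (int i = 1; i < closes.Count; i++)
        {
            ema = alpha * closes[i] + (1 - alpha) * ema;
            result.Add(ema);
        }
        return result;
    }
}
```

With closes of 10, 11, 12 and a period of 3, alpha is 0.5 and the series is 10, 10.5, 11.25.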
- Assemblies/: External DLLs (e.g., HtmlAgilityPack.dll, OpenNLP.dll).
- database/: StockPredictData_Schema.sql for SQL Server schema.
- Models/: OpenNLP models (EnglishSD.nbin, EnglishTok.nbin, EnglishPOS.nbin, Parser\tagdict).
- Properties/: Project metadata and settings.
- Resources/: UI images.
- Root: Source .cs files, .csproj, .sln, thesis (MTech Thesis.pdf).
- Prerequisites:
  - Visual Studio 2010+ (with .NET 4.8).
  - SQL Server (e.g., Express edition) for the StockPredictData database.
- Steps:
  - Copy external DLLs (HtmlAgilityPack.dll, Newtonsoft.Json.dll, OpenNLP.dll, etc.) to Assemblies/.
  - Update app.config with your SQL Server instance (e.g., Server=NEURALLAP\SQLEXPRESS).
  - Open SensexPrediction.sln in Visual Studio.
  - Run database/StockPredictData_Schema.sql in SQL Server Management Studio to create the database.
  - Build the solution to generate SensexPrediction.exe in bin\Debug\.
  - Run SensexPrediction.exe.
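Before editing app.config, it can help to sanity-check the connection string you plan to use. The sketch below is not part of the project: the class `ConnectionStringCheck` and its method are hypothetical, and the `Database=StockPredictData;Integrated Security=True` portion of the sample string is an assumption (the setup steps only show the `Server=` part). It uses `DbConnectionStringBuilder` from `System.Data.Common` to parse the string and confirm that it names both a server and a database.

```csharp
using System;
using System.Data.Common;

public static class ConnectionStringCheck
{
    // Returns true when the string parses and names both a server and a database.
    // Accepts the common keyword synonyms ("Data Source", "Initial Catalog").
    public static bool LooksUsable(string connectionString)
    {
        var builder = new DbConnectionStringBuilder();
        try
        {
            // The setter parses the string and throws ArgumentException if it is malformed.
            builder.ConnectionString = connectionString;
        }
        catch (ArgumentException)
        {
            return false;
        }

        bool hasServer = builder.ContainsKey("Server") || builder.ContainsKey("Data Source");
        bool hasDatabase = builder.ContainsKey("Database") || builder.ContainsKey("Initial Catalog");
        return hasServer && hasDatabase;
    }
}
```

For example, `ConnectionStringCheck.LooksUsable(@"Server=NEURALLAP\SQLEXPRESS;Database=StockPredictData;Integrated Security=True")` passes the check, while a string with only `Server=...` does not. This validates the format only; it does not open a connection.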
- External DLLs (in Assemblies/):
  - HtmlAgilityPack.dll: Web scraping.
  - Newtonsoft.Json.dll: JSON parsing (used by Twitterizer).
  - OpenNLP.dll, SharpEntropy.dll, SharpWordNet.dll: NLP processing.
  - SVM.dll: SVM classification.
  - Twitterizer.dll: Twitter API integration.
  - Microsoft.VisualBasic.PowerPacks.Vs.dll: UI controls (optional, check Sensex_Prediction_Form.cs).
- NLP Models: Included in Models/ (from OpenNLP).
- Note: Update [Miscellaneous].ModelPath in the database if the Models/ path changes.
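If the Models/ folder is moved, a quick check that the model files are still where the application expects them can save a confusing runtime failure. The following is a minimal sketch, not project code: `ModelPathCheck` and `FindMissing` are hypothetical names, and the file list is taken from the Models/ contents described in this README (the Parser\tagdict entry is left out of the check).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ModelPathCheck
{
    // File names taken from the Models/ listing in this README.
    private static readonly string[] ExpectedFiles =
    {
        "EnglishSD.nbin", "EnglishTok.nbin", "EnglishPOS.nbin"
    };

    // Returns the names of expected model files that are missing from modelDir.
    public static List<string> FindMissing(string modelDir)
    {
        var missing = new List<string>();
        foreach (string name in ExpectedFiles)
        {
            if (!File.Exists(Path.Combine(modelDir, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

Calling `ModelPathCheck.FindMissing(@"Models\")` from the application directory returns an empty list when all three files are present.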
- NLP: SharpNLP.cs uses OpenNLP models from Models\ for feature extraction.
- SVM: SVM.cs and SVMMapping.cs handle model training and mapping, saved to [Miscellaneous].ModelPath (e.g., svm_model.txt). History stored in [SVMModel] table; no filesystem backups confirmed.
- Paths: Hardcoded paths updated to relative (e.g., Models\, Dump.txt).
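To make the SVM step more concrete, the sketch below formats one training sample in LibSVM's sparse text format, `<label> <index>:<value> ...`, with 1-based feature indices and zero-valued features omitted. Whether SVM.cs or SVMMapping.cs writes data in exactly this form is not confirmed by this README; the class `LibSvmFormat` and the method `ToLine` are hypothetical and serve only to illustrate the format.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class LibSvmFormat
{
    // Formats one sample as "<label> <index>:<value> ...".
    // Indices are 1-based; features equal to zero are skipped (sparse encoding).
    public static string ToLine(int label, double[] features)
    {
        var sb = new StringBuilder();
        sb.Append(label.ToString(CultureInfo.InvariantCulture));

        for (int i = 0; i < features.Length; i++)
        {
            if (features[i] == 0.0) continue;

            sb.Append(' ')
              .Append((i + 1).ToString(CultureInfo.InvariantCulture))
              .Append(':')
              // Invariant culture keeps '.' as the decimal separator on any locale.
              .Append(features[i].ToString("R", CultureInfo.InvariantCulture));
        }
        return sb.ToString();
    }
}
```

For a sample labeled 1 with features 0.5, 0, 2, `ToLine` produces `1 1:0.5 3:2`.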
MTech Thesis.pdf: Full documentation of methodology and results.
- Relies on deprecated Google Finance URLs.
- Twitter integration uses Twitterizer 2.3.2 (pre-API v2).
- Proof-of-concept only; lacks real-time reliability.
- Helper functions are available under the Miscellaneous folder.
#AI #MachineLearning #Finance #Sensex #NLP #Csharp