jacob5412/Web-Mining-Assignments

Web Mining Assignments

This repository contains all the web mining assignments for the CSE3024 lab, Fall 2018.

24th July 2018 Exercise 1

  1. Write a program to tokenize
    • A sentence
    • Multiple sentences
  2. Write a program to count the frequency of tokenized words from two sentences.
  3. Write a program (using the NLTK toolkit in a Python environment) to tokenize
    • Sentence
    • Multiple sentences
    • A paragraph
    • Information of a complete web page
  4. Write a program to remove stop words and perform stemming using NLTK.
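
As a rough illustration of items 1 and 2, the sketch below tokenizes sentences and counts word frequencies using only the Python standard library. It is not the repository's solution: the helper names `split_sentences` and `tokenize`, the regular expressions, and the sample text are all assumptions made for this example, and an NLTK-based version would swap in NLTK's own tokenizers.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


# Hypothetical sample text, not taken from the assignment.
text = "Web mining is fun. Mining the web takes patience!"

sentences = split_sentences(text)
tokens = [tok for s in sentences for tok in tokenize(s)]
frequencies = Counter(tokens)

print(sentences)
print(frequencies.most_common(3))
```

The regex splitter is deliberately naive (it will mis-split abbreviations such as "e.g."), which is one reason a trained sentence tokenizer is usually preferred for real web pages.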

31st July 2018 Exercise 2

  1. Write a program to do stop word removal and stemming from a paragraph.
  2. Prepare a table with two columns: “Word” and “Frequency”.
  3. Print the frequency of words (terms) that start with A, B, S, D, or E.
  4. Find the term or terms with the maximum frequency.
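
The steps above could be sketched roughly as follows. This is an illustrative example only: the stop-word list `STOP_WORDS` is a tiny hypothetical stand-in for a real list, the sample paragraph is invented, and stemming is left out to keep the example dependency-free.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def term_frequencies(paragraph):
    """Tokenize a paragraph, drop stop words, and count the remaining terms."""
    tokens = re.findall(r"[a-z]+", paragraph.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


# Invented sample paragraph.
paragraph = "The spider crawls the web and the spider stores a snapshot of the web."
table = term_frequencies(paragraph)

# Two-column table: Word and Frequency.
print(f"{'Word':<10}{'Frequency':>9}")
for word, count in sorted(table.items()):
    print(f"{word:<10}{count:>9}")

# Frequencies of terms starting with A, B, S, D or E.
initials = ("a", "b", "s", "d", "e")
selected = {w: c for w, c in table.items() if w.startswith(initials)}
print(selected)

# All terms that share the maximum frequency.
top = max(table.values())
max_terms = sorted(w for w, c in table.items() if c == top)
print(max_terms, top)
```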

21st August 2018 Exercise 3

  1. Implement a single topical crawler (My Spider).
  2. Then implement multiple crawlers: run Myspider1 first, then run Myspider2 multiple times, sequentially, depending on some conditions.

4th September 2018 Exercise 4

  1. Coding Techniques
  2. TF-IDF
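
For reference, one common TF-IDF formulation can be sketched as below. The weighting used here (term frequency normalised by document length, multiplied by the natural log of N / document frequency) is only one of several variants and is an assumption, as are the function name `tf_idf` and the three toy documents.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "web mining finds patterns".split(),
    "web crawlers fetch pages".split(),
    "page rank scores pages".split(),
]
weights = tf_idf(docs)
for doc_weights in weights:
    print(doc_weights)
```

Terms that occur in fewer documents receive a higher weight, so in the first toy document "mining" outweighs "web".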

11th September 2018 Exercise 5

Estimate the PageRank for the given directed graph representing a web of six pages, with a damping factor of 0.9 and a maximum of 2 iterations.

  • First find adjacency matrix
  • Find stochastic matrix
  • Find transpose
  • Calculate page rank
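
The four steps above can be sketched in plain Python as follows. The assignment's actual graph is not reproduced here, so the `links` dictionary is a hypothetical six-page graph; the uniform starting rank and the handling of pages without out-links are likewise assumptions of this sketch.

```python
DAMPING = 0.9
MAX_ITERATIONS = 2

# Hypothetical six-page web: page -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2, 4], 4: [5], 5: [3]}
n = len(links)

# Step 1: adjacency matrix, adjacency[i][j] == 1 if page i links to page j.
adjacency = [[1 if j in links[i] else 0 for j in range(n)] for i in range(n)]

# Step 2: row-stochastic matrix (each row divided by the page's out-degree).
# A page with no out-links is treated as linking to every page (assumption).
stochastic = []
for row in adjacency:
    out_degree = sum(row)
    stochastic.append(
        [x / out_degree for x in row] if out_degree else [1 / n] * n
    )

# Step 3: transpose, so that row i lists the contributions flowing into page i.
transpose = [list(col) for col in zip(*stochastic)]

# Step 4: iterate rank = (1 - d) / n + d * (transpose @ rank).
rank = [1 / n] * n
for _ in range(MAX_ITERATIONS):
    rank = [
        (1 - DAMPING) / n
        + DAMPING * sum(transpose[i][j] * rank[j] for j in range(n))
        for i in range(n)
    ]

for page, score in enumerate(rank):
    print(f"page {page}: {score:.4f}")
```

With only two iterations the scores are far from converged, so they should be read as the intermediate estimate the exercise asks for rather than the stationary PageRank.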

18th September 2018 Exercise 6

  1. TF-IDF
  2. Cosine Similarity
  3. Page Rank
  4. Golomb Coding
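
As a small illustration of the last item, the sketch below encodes a non-negative integer with a Golomb code. It assumes one common convention (quotient in unary as ones terminated by a zero, remainder in truncated binary); the function name `golomb_encode` and the sample values are made up for this example, and other texts use the opposite unary convention.

```python
def golomb_encode(n, m):
    """Golomb-encode the non-negative integer n with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("n must be >= 0 and m must be >= 1")
    q, r = divmod(n, m)

    # Quotient in unary: q ones followed by a terminating zero.
    unary = "1" * q + "0"

    # Remainder in truncated binary.
    b = (m - 1).bit_length()      # equals ceil(log2(m)) for m >= 1
    cutoff = (1 << b) - m         # remainders below this use b - 1 bits
    if r < cutoff:
        width, value = b - 1, r
    else:
        width, value = b, r + cutoff
    binary = format(value, "b").zfill(width) if width > 0 else ""
    return unary + binary


for number in (0, 3, 7, 9):
    print(number, golomb_encode(number, 4))
```

When m is a power of two the cutoff is zero and every remainder uses exactly log2(m) bits, which is the special case known as Rice coding.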

10th October 2018 Exercise 7

  1. Naive Bayes
    • Dataset
    • Find accuracy, AUC and confusion matrix
  2. Multinomial Naive Bayes
  3. Apriori
  4. K-means

16th October 2018 Exercise 8

  1. Construct a directed graph and an undirected graph with nodes (plot function)
  2. Colour the edges and nodes (plot)
  3. Name the nodes
  4. Print adjacency matrix of undirected graph
  5. Add few extra nodes to the network and name them as well
  6. Print diameter of the graph
  7. Find degree of all nodes
  8. Find the in-degrees and out-degrees of all nodes
  9. Find the density of the graph
  10. Find closeness centrality of all nodes
  11. Create network from a given data set. You can choose any one of the data sets from the following link
  12. Prepare a histogram of 'Frequency' vs 'Degree of Vertices'
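
Several of the measures listed above (adjacency matrix, degree, density, diameter, closeness centrality, and the degree histogram) can be computed by hand for a small undirected graph, as in the sketch below. The five-node graph, its node names, and the helper `distances_from` are hypothetical; plotting is omitted, and the diameter and closeness calculations assume the graph is connected.

```python
from collections import Counter, deque

# Hypothetical named nodes and undirected edges.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

neighbours = {node: set() for node in nodes}
for u, v in edges:
    neighbours[u].add(v)
    neighbours[v].add(u)

# Adjacency matrix, with rows and columns in the order of `nodes`.
adjacency = [[1 if v in neighbours[u] else 0 for v in nodes] for u in nodes]

# Degree of every node, and density = 2E / (N * (N - 1)) for an undirected graph.
degree = {node: len(neighbours[node]) for node in nodes}
density = 2 * len(edges) / (len(nodes) * (len(nodes) - 1))


def distances_from(source):
    """Breadth-first search: shortest hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


all_distances = {node: distances_from(node) for node in nodes}

# Diameter: the longest shortest path between any two nodes.
diameter = max(max(d.values()) for d in all_distances.values())

# Closeness centrality: (N - 1) divided by the sum of distances to all other nodes.
closeness = {
    node: (len(nodes) - 1) / sum(all_distances[node].values()) for node in nodes
}

# Degree histogram: how many vertices have each degree.
histogram = Counter(degree.values())

for row in adjacency:
    print(row)
print(degree, density, diameter)
print(closeness)
print(sorted(histogram.items()))
```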

Digital Assignment 1

  1. Convolutional Neural Network (CNN) for Image Recognition
  2. Convolutional Neural Network (CNN) for Optical Character Recognition (OCR)

Digital Assignment 2

Deep Neural Network for predicting housing prices
