In this project, we are devising an efficient strategy for measuring the similarity of web pages based on their DOM structure. The approach follows this research paper: IEEEResearchPaper
Progress so far:
- Generated DOM tree for given XML/XHTML code
- Obtained node sequence for the given DOM
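
The two steps above can be illustrated with a minimal sketch. It is not the project's actual code: it assumes the input is well-formed XML/XHTML, uses Python's standard `xml.etree.ElementTree` parser, and assumes the node sequence is the list of tag names in pre-order (document order), which may differ from the ordering the paper uses. The names `local_name`, `node_sequence`, and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}name' (common in XHTML)."""
    return tag.rsplit("}", 1)[-1]


def node_sequence(root):
    """Return the tag names of all elements under root in pre-order.

    Assumption: pre-order (document order) is the traversal wanted here.
    Element.iter() walks the tree depth-first in document order.
    """
    return [local_name(element.tag) for element in root.iter()]


# Hypothetical sample page; any well-formed XML/XHTML string works.
SAMPLE = (
    "<html><head><title>t</title></head>"
    "<body><div><p>a</p><p>b</p></div></body></html>"
)

root = ET.fromstring(SAMPLE)      # step 1: build the DOM tree
sequence = node_sequence(root)    # step 2: flatten it to a node sequence
print(sequence)
# ['html', 'head', 'title', 'body', 'div', 'p', 'p']
```

Reducing each page to a flat sequence like this means two pages can later be compared with ordinary sequence-comparison techniques. Real-world HTML that is not well-formed would need a more lenient parser than the one assumed here.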