Network analysis and machine learning about patent citation networks among firms

AidenJiang01/NetworkAnalysis_MachineLearning_PatentCitation

NetworkAnalysis_MachineLearning_PatentCitation

This repository contains a collection of my practice Jupyter notebooks on patent citation networks among firms within the G03B and G03F IPC categories.

These notebooks form a series that runs from network creation and exploration to feature generation and storage, and then uses network-related features to build, train, tune, and evaluate machine learning and deep learning models. The main theme of the series is the importance of analyzing network data directly: doing so yields insights about network characteristics and structure that cannot be obtained from attribute data and machine learning/deep learning alone.

  1. Database Interaction: Utilizing DB-API and SQL Magic in Jupyter notebooks (MySQL).
  2. Network Analysis: Analyzing global network structures, node centrality, community detection, and network visualization in the IPC category 'G03B' (NetworkX).
  3. Network Analysis: The same analyses as item 2, applied to the IPC category 'G03F' (NetworkX).
  4. Machine Learning: Training ML models to predict network characteristics using preprocessed attribute features (Scikit-Learn).
  5. Deep Learning: The same prediction task as item 4, using neural network models (Keras, scikeras).
  6. Unsupervised Learning: Using dimensionality reduction and clustering algorithms to capture group structures and comparing them to network position classification (Scikit-Learn).
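
As a rough illustration of the kind of network analysis listed in items 2 and 3, the sketch below builds a small directed citation network with NetworkX and computes one global measure (density), one node centrality (in-degree centrality), and a community partition. The firm names and edges are hypothetical toy data, the helper `summarize_network` is not taken from the notebooks, and greedy modularity maximization is only one possible community detection method; the notebooks may use different measures and algorithms.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical toy data: each edge points from the citing firm to the cited firm.
citations = [
    ("FirmA", "FirmB"), ("FirmB", "FirmC"), ("FirmC", "FirmA"), ("FirmA", "FirmC"),
    ("FirmD", "FirmE"), ("FirmE", "FirmF"), ("FirmF", "FirmD"), ("FirmD", "FirmF"),
    ("FirmC", "FirmD"),  # a single citation bridging the two groups
]


def summarize_network(edges):
    """Return a few structural summaries of a directed citation network."""
    G = nx.DiGraph()
    G.add_edges_from(edges)

    # Node centrality: share of other firms that cite each firm.
    in_centrality = nx.in_degree_centrality(G)

    # Community detection on the undirected version of the graph
    # (an assumption made here to keep the example simple).
    communities = greedy_modularity_communities(G.to_undirected())

    return {
        "density": nx.density(G),  # global structure: edges / possible edges
        "in_degree_centrality": in_centrality,
        "communities": sorted(sorted(c) for c in communities),
    }


summary = summarize_network(citations)
print("Density:", summary["density"])
print("Most cited:", max(summary["in_degree_centrality"],
                         key=summary["in_degree_centrality"].get))
print("Communities:", summary["communities"])
```

Measures like these can then be exported as node-level features for the machine learning steps described in items 4 to 6.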

The data files used in and generated by these notebooks are also uploaded to this repository.

  1. Network objects are stored in .graphml format.
  2. Data generated throughout the analysis processes are stored in .csv format.
  3. The original adjacency matrices of the citation networks among firms are stored as worksheets in .xlsx files.
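
The three storage formats above can be connected in a short round trip. The following is a minimal sketch, not the notebooks' own code: it starts from a small in-memory adjacency matrix (standing in for a worksheet that could be loaded with something like `pandas.read_excel`), builds a directed graph, saves and reloads it as .graphml, and writes simple node features to .csv. The firm names, file names, helper functions, and the convention that rows are citing firms and columns are cited firms are all assumptions made for illustration.

```python
import csv
import os
import tempfile

import networkx as nx

# Hypothetical adjacency matrix: adjacency[i][j] is the number of citations
# from firms[i] (citing) to firms[j] (cited). This orientation is an assumption.
firms = ["FirmA", "FirmB", "FirmC"]
adjacency = [
    [0, 2, 0],
    [0, 0, 1],
    [3, 0, 0],
]


def graph_from_adjacency(names, matrix):
    """Build a weighted directed graph from a square adjacency matrix."""
    G = nx.DiGraph()
    G.add_nodes_from(names)
    for i, citing in enumerate(names):
        for j, cited in enumerate(names):
            if matrix[i][j] > 0:
                G.add_edge(citing, cited, weight=matrix[i][j])
    return G


def save_and_reload(G, directory):
    """Store the graph as .graphml, reload it, and export node features as .csv."""
    graph_path = os.path.join(directory, "toy_network.graphml")
    nx.write_graphml(G, graph_path)
    reloaded = nx.read_graphml(graph_path)

    csv_path = os.path.join(directory, "toy_features.csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["firm", "in_degree", "out_degree"])
        for node in reloaded.nodes:
            writer.writerow([node, reloaded.in_degree(node), reloaded.out_degree(node)])
    return reloaded, csv_path


with tempfile.TemporaryDirectory() as tmp:
    reloaded, csv_path = save_and_reload(graph_from_adjacency(firms, adjacency), tmp)
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))

print(rows)
```

Keeping the graph in .graphml preserves edge attributes such as the citation counts, while the .csv file holds only the flat per-firm features that a model would consume.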
