Sample data and code to accompany published article

oxford-mc/patent-mining

Mining Digital Identity Insights: Patent Analysis using NLP

A repository containing the reproducibility package that accompanies the published article

About The Project

The field of digital identity innovation has grown significantly over the last 30 years, with over 6,000 technology patents registered worldwide. However, many questions remain about who controls and owns our digital identity and intellectual property and, ultimately, where the future of digital identity is heading.

To investigate this further, this research mines digital identity patents and explores core themes such as identity, systems, privacy, and security, along with emerging fields such as blockchain, financial transactions, and biometric technologies. It uses natural language processing (NLP) methods, including part-of-speech tagging, clustering, topic classification, noise reduction, and lemmatisation. Finally, the research employs graph modelling and statistical analysis to discern inherent trends and forecast future developments.

The findings significantly contribute to the digital identity landscape, identifying key players, emerging trends, and technological progress. This research serves as a valuable resource for academia and industry stakeholders, aiding in strategic decision-making and investment in emerging technologies and facilitating navigation through the dynamic realm of digital identity technologies.

Getting Started

To get a local copy up and running, install the prerequisites below and open the notebooks in Jupyter.

Prerequisites

The following prerequisites are required:

  1. Jupyter notebook
  2. Python
  3. Python packages:
    [Image: PythonPackages]

Usage

This section outlines the key commands used for the analysis. More detailed comments are available in the notebooks themselves.

Patent Analysis Notebook

  1. Output Patent Leaders
    The following command outputs the top 12 entities by patent count in the given dataframe.
    output_leaders(wd)
    [Image: Leaders]
  2. Display Entity
    This code displays patent counts for the entire dataset, for the period 1996-2020, and for the period after 2020. It also outputs the first and last patents in the dataframe.
    display_entity(wdfull, "Diebold", [], [])
    [Image: Display1]
  3. Display Entity with Keywords
    This code displays patent count metrics for each of the provided keyword search terms.
    display_entity(wdfull, "Diebold", ["privacy","trust"], [])
    [Image: Display2]
  4. Display Entity with Additional Ranges
    This code lets you specify additional sub-ranges, in this case the years that Kim Cameron was on staff.
    display_entity(wdfull, "Microsoft", ["transaction", "game", "privacy", "trust"], [("Kim Cameron", 1999, 2011, [])])
    [Image: Display3]
  5. Output Patents for Entity
    This code outputs a list of patents, with patent hyperlinks, for the named entity.
    print_all_patents(wdfull, "Microsoft")
    [Image: Display_Patent]
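
To give a feel for the kind of counting the commands above perform, here is a minimal, dependency-free sketch. It is not the notebook's implementation: the helper names `top_entities` and `entity_summary`, the record fields `assignee`, `year`, and `abstract`, and the sample records are all assumptions made for illustration. The notebook itself operates on a dataframe, and the meaning of the fourth element in each range tuple is not documented here, so the sketch simply ignores it.

```python
from collections import Counter

# Hypothetical records. The field names "assignee", "year" and "abstract"
# are assumptions, not the notebook's actual column names.
patents = [
    {"assignee": "Microsoft", "year": 2005, "abstract": "privacy preserving identity claims"},
    {"assignee": "Microsoft", "year": 2021, "abstract": "trust framework for transactions"},
    {"assignee": "Diebold", "year": 1998, "abstract": "secure transaction terminal"},
    {"assignee": "Diebold", "year": 2010, "abstract": "privacy screen for kiosk"},
    {"assignee": "Diebold", "year": 2022, "abstract": "biometric trust anchor"},
]


def top_entities(records, n=12):
    """Return the n entities with the most patents as (name, count) pairs."""
    return Counter(r["assignee"] for r in records).most_common(n)


def entity_summary(records, entity, keywords=(), ranges=()):
    """Count one entity's patents overall, per period, per keyword and per named range."""
    own = [r for r in records if r["assignee"] == entity]
    summary = {
        "total": len(own),
        "1996-2020": sum(1 for r in own if 1996 <= r["year"] <= 2020),
        "post-2020": sum(1 for r in own if r["year"] > 2020),
    }
    # Keyword counts: case-insensitive substring match on the abstract text.
    for word in keywords:
        summary[word] = sum(1 for r in own if word.lower() in r["abstract"].lower())
    # Named ranges: (label, start_year, end_year, ...); extra elements are ignored.
    for label, start, end, *_ in ranges:
        summary[label] = sum(1 for r in own if start <= r["year"] <= end)
    return summary


leaders = top_entities(patents)
diebold = entity_summary(patents, "Diebold", ["privacy", "trust"])
microsoft = entity_summary(patents, "Microsoft", [], [("Kim Cameron", 1999, 2011, [])])
```

Printing `leaders`, `diebold`, and `microsoft` shows the ranked entities and the per-period, per-keyword, and per-range counts for the sample records.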

Patent Modeling Notebook

  1. Model Keywords
    This code outputs base statistics for a search term, including percentage movement, mean, std, and error, and plots the trend over time.
    selected_words = {'certificate','storage','privacy','communication','encryption'}  # Example set of words
    colors = generate_dark_colors(len(selected_words))
    plot_graph(selected_words, 'TF-IDF','Security',3, -0.00002)
    [Images: Stat1, SC3]
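
The statistics mentioned above can be illustrated with a small standard-library sketch. This is an assumption-laden example, not the notebook's `plot_graph`: the function name `trend_stats` is hypothetical, the yearly scores are made-up values rather than results from the article, "error" is interpreted here as the standard error of the mean, and the trend is taken to be an ordinary least-squares slope.

```python
from statistics import mean, stdev

# Hypothetical yearly TF-IDF scores for a single keyword. The numbers are
# invented for illustration and are not results from the article.
scores_by_year = {2016: 0.010, 2017: 0.012, 2018: 0.011, 2019: 0.015, 2020: 0.016}


def trend_stats(series):
    """Summarise a {year: score} series with basic descriptive statistics."""
    years = sorted(series)
    values = [series[y] for y in years]
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation
    year_avg = mean(years)
    # Ordinary least-squares slope of score against year.
    slope = sum((y - year_avg) * (v - avg) for y, v in zip(years, values)) / sum(
        (y - year_avg) ** 2 for y in years
    )
    return {
        "mean": avg,
        "std": sd,
        "std_error": sd / len(values) ** 0.5,  # assumed meaning of "error"
        "pct_change": (values[-1] - values[0]) / values[0] * 100,  # first to last year
        "slope": slope,
    }


stats = trend_stats(scores_by_year)
```

A positive `slope` and `pct_change` indicate that the keyword's weight is rising over the period; a plot of `scores_by_year` would show the same trend visually.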

Contact

Matthew Comb - matthew.comb@linacre.ox.ac.uk

Project Link: https://github.com/oxford-mc/patent-mining
