Computational Representation of Cellular Lines: A Text Mining Approach

This repository contains the code and related resources for a project on the computational representation of cell lines using text mining techniques.

Overview

The project derives a computational representation of cell lines from data in the Cellosaurus database and PubMed. The methodology combines text mining, feature extraction, and the construction of a dendrogram that represents the relationships between cell lines.
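As a rough illustration of the pipeline described above, the sketch below turns short text descriptions into TF-IDF feature vectors and merges cell lines bottom-up into a dendrogram-like tree. It is not the code from the notebooks: the toy descriptions, the smoothed TF-IDF weighting, the cosine distance, the average-linkage rule, and the function names `tfidf_vectors`, `cosine_distance`, and `average_linkage` are all assumptions chosen for the example.

```python
import math
from collections import Counter

# Hypothetical toy descriptions; the real text comes from Cellosaurus and PubMed.
docs = {
    "HeLa": "cervical carcinoma epithelial cell line hpv",
    "SiHa": "cervical carcinoma squamous cell line hpv",
    "MCF-7": "breast adenocarcinoma epithelial cell line estrogen",
}

def tfidf_vectors(docs):
    """Return {name: {term: weight}} using a smoothed TF-IDF weighting (assumed)."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many descriptions each term appears.
    df = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for name, words in tokens.items():
        tf = Counter(words)
        vectors[name] = {
            term: (count / len(words)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return 1.0 - dot / norm if norm else 1.0

def average_linkage(vectors):
    """Merge the two closest clusters until one tree remains.

    Leaves are names; internal nodes are (left, right, merge_distance).
    """
    clusters = [(name, [name]) for name in sorted(vectors)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the members of both clusters.
                dists = [cosine_distance(vectors[a], vectors[b])
                         for a in clusters[i][1] for b in clusters[j][1]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        avg, i, j = best
        merged = ((clusters[i][0], clusters[j][0], round(avg, 3)),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

tree = average_linkage(tfidf_vectors(docs))
print(tree)
```

With these toy descriptions, HeLa and SiHa share the most terms and are merged first, and MCF-7 joins them at a larger distance. For real data, a library such as SciPy (`scipy.cluster.hierarchy.linkage` and `dendrogram`) would typically handle the clustering and plotting.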

Contents

  • data/: Directory containing raw data files from Cellosaurus and PubMed.
  • dataret.ipynb: Notebook for retrieving data from the Cellosaurus and PubMed databases.
  • dataproc.ipynb: Code for processing retrieved data and constructing the hierarchical representation of cell lines.
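The retrieval step is not described in detail here, so the following is only a sketch of one way a PubMed search request could be prepared, using the public NCBI E-utilities `esearch` endpoint. The function name `build_pubmed_query`, the `[Title/Abstract]` search field, and the default result limit are assumptions for illustration and may differ from what `dataret.ipynb` does. The code only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for Entrez databases such as PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_pubmed_query(cell_line, max_results=20):
    """Build an esearch URL for articles mentioning a cell line (hypothetical helper)."""
    params = {
        "db": "pubmed",                               # search the PubMed database
        "term": f'"{cell_line}"[Title/Abstract]',     # assumed search field
        "retmax": max_results,                        # maximum number of IDs returned
        "retmode": "json",                            # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"

url = build_pubmed_query("HeLa")
print(url)
```

The resulting URL can be fetched with any HTTP client. The response lists PubMed IDs, which can then be passed to the `efetch` endpoint to download abstracts.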

Setup

  1. Clone the repository:
    git clone https://github.com/ivan-carrera/engproc2023.git
