ARAMEMNON Scraper

Description

This is an R-based app for scraping gene annotation and transmembrane domain information for large numbers of plant genes from ARAMEMNON. To run the app on your computer, you need R installed with the required libraries (see below). 'ARAMEMScraper_app.R' is the core of the app (run this script in R!) and 'ScraperScript.R' contains the source of the scraping function, so both files must be in the same folder for the app to run properly in your local environment. Be aware that a query with a large number of genes (>100) will take some time to return a complete list of results. This tool works for the model plant Arabidopsis thaliana as well as nine crop plants in the latest version of ARAMEMNON (version 8.1 in July 2024).

You can read more about how the source works and what the output can be used for in 'workbook.Rmd'. An example dataset is provided as an Excel file (Suba4-2021-11-8_1-11.xlsx), and example outputs are included as txt files.

I created this app for the common good. But if you are looking for the original citation, here it is:

Schwacke R, Schneider A, Van Der Graaff E, Fischer K, Catoni E, Desimone M, Frommer WB, Flügge UI, Kunze R. (2003) ARAMEMNON, a Novel Database for Arabidopsis Integral Membrane Proteins. Plant Physiol. 131: 16-26.

All credit goes to them for making the database public, so cite their paper if you are using this app for your research.

How to run/install

No issues have been found so far when running the code in RStudio version 2024.04.2+764 and R version 4.3.2 (2024-07-11).

Download 'ARAMEMScraper_app.R' and 'ScraperScript.R' and put them into the same folder.

Run ARAMEMScraper_app.R in R GUI or RStudio. A browser window then opens showing the app.
Either enter accession IDs into the text field or upload a text file containing all the gene IDs of interest.
Press the submit button and wait until the result table is generated on the right. After you submit a job, a progress bar appears at the bottom right so you can monitor your query. Waiting time depends on the number of accession IDs submitted in each job. Do not close the browser while you are waiting!
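
Before launching, it can help to confirm that both scripts really are in the same folder. The following is only a sketch: the folder path and the helper name `missing_files` are made up for illustration, and the `shiny::runApp()` call assumes that 'ARAMEMScraper_app.R' ends with a Shiny app object, which has not been verified here. If that assumption does not hold, simply run the script as described above.

```r
# Hypothetical location of the downloaded files; change to your own folder.
app_dir <- "~/ARAMEMNON-Scraper"

# Both scripts must sit in the same folder (see Description).
needed <- c("ARAMEMScraper_app.R", "ScraperScript.R")

# Return the entries of `files` that are absent from `dir`.
missing_files <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

absent <- missing_files(app_dir, needed)
if (length(absent) > 0) {
  message("Missing from ", app_dir, ": ", paste(absent, collapse = ", "))
} else if (interactive()) {
  # Assumption: the app file ends with a Shiny app object, so runApp() can launch it.
  shiny::runApp(file.path(app_dir, "ARAMEMScraper_app.R"))
}
```

The app is only launched in an interactive session, so the check can be run safely from a script without opening a browser.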

Required libraries include:

library(shinybusy)
library(shiny)
library(stringr)
library(rvest)
library(tidyr)
library(shinyjs)
library(pbapply)

To install these libraries, use the following command:

install.packages("xxx") # single library

install.packages(c("xxx", "yyy", "zzz")) # multiple libraries at once

where xxx, yyy and zzz are the names of the libraries, such as shiny.
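
To avoid reinstalling libraries you already have, you can first check which of the required ones are missing. This is a minimal sketch rather than part of the app: `find_missing` is a hypothetical helper name, and the install line is left commented out so that nothing is downloaded until you choose to.

```r
# Libraries required by the app. Package names are case-sensitive;
# the CRAN package is spelled "shinyjs".
required <- c("shiny", "shinybusy", "shinyjs", "stringr",
              "rvest", "tidyr", "pbapply")

# Return the packages from `pkgs` that are not installed yet.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_pkgs <- find_missing(required)
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
} else {
  message("All required libraries are installed.")
}
```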

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Authors

Dr Chun Pong Lee is the primary author. Professor Harvey Millar was my Post-doc supervisor and project manager.

Affiliation: ARC Centre of Excellence in Plant Energy Biology
