midorikocak/tree-dataset: read tree data from XML, make it linear, and create the tree again.

We will be ingesting, transforming, storing, transforming, and displaying some data.

1. First, we'll need data

ImageNet (http://imagenet.stanford.edu/) is a commonly used dataset in machine learning research. We'll be using its taxonomy system.

When you go to http://imagenet.stanford.edu/synset?wnid=n02486410, you'll see a tree on the left. Your job is to get this tree and transform it into a linear form, like this:

[
    {name: 'ImageNet 2011 Fall Release', size: 32326},
    {name: 'ImageNet 2011 Fall Release > plant, flora, plant life', size: 4486},
    {name: 'ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton', size: 2},
    {name: 'ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton > planktonic algae', size: 0},
    {name: 'ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton > diatom', size: 0},
    {name: 'ImageNet 2011 Fall Release > plant, flora, plant life > microflora', size: 0},
    ...
]

We'll use > as a separator of categories/subcategories.
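
A minimal sketch of the flattening step, assuming the taxonomy has already been downloaded as nested XML. The element name `synset` and the attribute `words` are assumptions about that file, and the inline `SAMPLE_XML` is a hypothetical excerpt, not real downloaded data. The sketch also assumes that `size` is the number of descendants of a node, which matches the sample above (phytoplankton has two leaf children and size 2).

```python
import xml.etree.ElementTree as ET

SEPARATOR = " > "

# Hypothetical excerpt; the tag "synset" and attribute "words" are assumptions.
SAMPLE_XML = """
<synset words="ImageNet 2011 Fall Release">
  <synset words="plant, flora, plant life">
    <synset words="phytoplankton">
      <synset words="planktonic algae"/>
      <synset words="diatom"/>
    </synset>
    <synset words="microflora"/>
  </synset>
</synset>
"""


def flatten(node, prefix=""):
    """Return (rows, descendant_count) for `node`, rows in pre-order."""
    label = node.get("words")
    path = prefix + SEPARATOR + label if prefix else label
    row = {"name": path, "size": 0}
    rows = [row]
    for child in node.findall("synset"):
        child_rows, child_size = flatten(child, path)
        rows.extend(child_rows)
        # Each child counts itself plus all of its own descendants.
        row["size"] += 1 + child_size
    return rows, row["size"]


rows, total = flatten(ET.fromstring(SAMPLE_XML.strip()))
for row in rows:
    print(row)
```

Pre-order traversal keeps every parent ahead of its children, which is the same ordering the sample list uses.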

How you download this data is completely up to you; everything is allowed.

2. Second, we'll need to store it somewhere

Create a database (use any database system you like or want to try) to store these tuples (string, number) and fill it with the data you obtained in the first step.
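
One possible way to store the tuples, sketched with Python's built-in `sqlite3` module and an in-memory database. The table name `nodes` and its columns are hypothetical choices, and the rows are the sample values from step 1 rather than a full download.

```python
import sqlite3

# Sample (name, size) tuples copied from the linear form in step 1.
rows = [
    ("ImageNet 2011 Fall Release", 32326),
    ("ImageNet 2011 Fall Release > plant, flora, plant life", 4486),
    ("ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton", 2),
]

conn = sqlite3.connect(":memory:")  # swap for a file path to persist the data
conn.execute(
    """
    CREATE TABLE nodes (
        id   INTEGER PRIMARY KEY,   -- preserves insertion order
        name TEXT NOT NULL UNIQUE,  -- full path with ' > ' separators
        size INTEGER NOT NULL
    )
    """
)
conn.executemany("INSERT INTO nodes (name, size) VALUES (?, ?)", rows)
conn.commit()

# Read the linear form back in the order it was inserted.
stored = conn.execute("SELECT name, size FROM nodes ORDER BY id").fetchall()
print(stored)
```

The `UNIQUE` constraint assumes that every full path occurs only once in the data; drop it if the downloaded taxonomy turns out to contain duplicate paths.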

3. Making sense of it

Next, we'll convert it back to a tree. Like this:

{
    name: 'ImageNet 2011 Fall Release',
    size: 32326,
    children: [
        {
            name: 'plant, flora, plant life',
            size: 4486,
            children: [
                {
                    name: 'phytoplankton',
                    size: 2,
                    children: [
                        ...
                    ]
                },
                ...
            ]
        },
        ...
    ]
}

Can you write an algorithm that will output such a tree? No cheating here: you have to read this data in linear form from the database.
What is the complexity of your algorithm (in big O notation)?
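
A sketch of one way to rebuild the tree, not necessarily the approach used in this repository. It indexes every row by its full path in a dictionary, then links each node to the entry for its parent path. It assumes that paths are unique and that no category name itself contains the separator `' > '`; the helper name `build_tree` is hypothetical.

```python
SEPARATOR = " > "


def build_tree(rows):
    """Rebuild a nested tree from (path, size) rows read from the database."""
    nodes = {}
    # Pass 1: create one node per row, keyed by its full path.
    for path, size in rows:
        name = path.rpartition(SEPARATOR)[2]  # last segment of the path
        nodes[path] = {"name": name, "size": size, "children": []}

    # Pass 2: attach each node to its parent; the node without a parent is the root.
    root = None
    for path, node in nodes.items():
        parent_path = path.rpartition(SEPARATOR)[0]
        if parent_path:
            nodes[parent_path]["children"].append(node)
        else:
            root = node
    return root


rows = [
    ("ImageNet 2011 Fall Release", 32326),
    ("ImageNet 2011 Fall Release > plant, flora, plant life", 4486),
    ("ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton", 2),
    ("ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton > planktonic algae", 0),
    ("ImageNet 2011 Fall Release > plant, flora, plant life > phytoplankton > diatom", 0),
    ("ImageNet 2011 Fall Release > plant, flora, plant life > microflora", 0),
]
tree = build_tree(rows)
```

Because the two passes are separate, the rows may arrive in any order. Each row costs a constant number of dictionary operations, so the expected running time is O(n) in the number of rows, or O(total characters in all paths) if the cost of splitting and hashing the path strings is counted.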

4. Now it's time to show the data again

Can you design and build an interface to show this data?
Can you implement search in this UI?
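
Search could be approached in many ways. The sketch below assumes a case-insensitive substring match on node names and prunes the tree so that only matching nodes and their ancestors remain, which suits a collapsible tree view. The function name `filter_tree` and the small `tree` literal are hypothetical.

```python
def filter_tree(node, query):
    """Return a pruned copy of `node` with matches and their ancestors, or None."""
    kept = []
    for child in node.get("children", []):
        pruned = filter_tree(child, query)
        if pruned is not None:
            kept.append(pruned)
    # Keep this node if it matches itself or if any descendant matched.
    if query.lower() in node["name"].lower() or kept:
        return {"name": node["name"], "size": node["size"], "children": kept}
    return None


tree = {
    "name": "ImageNet 2011 Fall Release",
    "size": 32326,
    "children": [
        {
            "name": "plant, flora, plant life",
            "size": 4486,
            "children": [
                {
                    "name": "phytoplankton",
                    "size": 2,
                    "children": [
                        {"name": "planktonic algae", "size": 0, "children": []},
                        {"name": "diatom", "size": 0, "children": []},
                    ],
                },
                {"name": "microflora", "size": 0, "children": []},
            ],
        }
    ],
}

result = filter_tree(tree, "Diatom")
```

Every node is visited once, so a search costs time linear in the size of the tree; the original tree is left unmodified because the function builds new dictionaries.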

Feel free to use any tools, frameworks, or libraries: whatever you are most comfortable with, or something new that you have wanted to try for a long time. Just let me know what you chose, why, and what your previous experience with it was.
