Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement .stat() function #159

Merged
merged 1 commit into from Jan 30, 2018
Merged

Implement .stat() function #159

merged 1 commit into from Jan 30, 2018

Conversation

hacdias
Copy link
Member

@hacdias hacdias commented Jan 30, 2018

To implement this, I take a look at Go's repo.stat implementation and I basically do the same on each one except on repoSize.

  • For numObjects I use the number of blocks on the Blockstore
  • For repoPath I just used this.path
  • For version I use this.version.get
  • For storageMax I use the config Datastore.StorageMax or the maximum integer number.
  • For repoSize I sum the number of bytes of the blocks and their keys from the Blockstore + the same for the Datastore and the Keystore. This doesn't count with api and ´config` files' size but it won't differ much from the real size.

I also added a simple test.

@ghost ghost assigned hacdias Jan 30, 2018
@ghost ghost added the status/in-progress In progress label Jan 30, 2018
@hacdias hacdias changed the title WIP: Implement .stat() function Implement .stat() function Jan 30, 2018
@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

@pgte @dignifiedquire @diasdavid I think this is done and the tests are passing 😄 Could you take a look and if it looks good, merge and publish a new version?

Needed for ipfs/js-ipfs#1197

src/index.js Outdated
* @return {void}
*/
stat (callback) {
waterfall([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not do the stats in parallel, and aggregate them to an object at the end?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there are some that use the same variable. Wouldn't that cause some problem eventually?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you require some of them to be series, you can still make the all series, but perhaps use async/series with the object argument to make this less explicit.

Copy link
Contributor

@pgte pgte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comments.

query (query, callback) {
pull(
store.query(query),
pull.collect((err, list) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can pass callback directly to pull.collect here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Thanks 😄

src/index.js Outdated
* @return {void}
*/
stat (callback) {
waterfall([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you're using waterfall to pass the accumulator, but since, by what I gather, each step is creating it's own entry, perhaps you could use async/parallel with the object arguments, collecting them all automatically in the final callback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple steps use the repoSize. If it was parallel I think I could end up messed up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So perhaps you could calculate that first, and then proceed to make the rest parallel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah!


function getSize (queryFn, callback) {
pull(
queryFn.query({}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to avoid buffering: I think you can reduce this iteratively instead of buffering the entire list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment refers to line below where you do pull.collect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I didn't get this. How can I do this iteratively?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the query is a pull-stream, you can use pull.reduce like this:

pull(
  source,
  pull.reduce((val) => compute(val), 0, callback))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't know that. Thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! 😄

@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

@dignifiedquire @pgte changes made!

src/index.js Outdated
options = Object.assign({}, {human: false}, options)

parallel([
(cb) => cb(null, this.path),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for this, just use it directly at the end

src/index.js Outdated
(cb) => this.version.get(cb),
(cb) => waterfall([
(cb) => cb(null, {
repoSize: new Big(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need, just use the value directly

src/index.js Outdated
stats.repoSize = stats.repoSize.plus(size)
cb(err, stats)
}),
(stats, cb) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the below doesn't need to be a callback

src/index.js Outdated
(cb) => cb(null, {
repoSize: new Big(0)
}),
(stats, cb) => this.blocks.query({}, (err, list) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the size fetching below can all happen in parallel, just need to sum at the end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

@dignifiedquire @pgte hopefully everything is nicer now 😃

Copy link
Member

@dignifiedquire dignifiedquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two more things :)

expect(stats).to.have.property('repoPath')
expect(stats).to.have.property('repoSize')
expect(stats).to.have.property('storageMax')
done()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just realized it now, these don’t test that they are the right numbers, so you should at least check thatj they are greater 0 or similar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dignifiedquire done! had to move the tests to another file so they run after the other tests add some data to the repo 😄

src/index.js Outdated
})
}),
(cb) => getSize(this.datastore, (err, size) => {
cb(err, new Big(size))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be simplified as getSize already returns a bigint

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

src/index.js Outdated
.plus(results[4])

if (options.human) {
size = size.div(1048576)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use something like https://www.npmjs.com/package/pretty-bytes instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec says MiB so it must be in base 2 and not 10. Unless we could find another package. But why importing another package if we can just divide by that number?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, the package only makes sense if the output should be dynamically adjusted depending on the size.

dignifiedquire
dignifiedquire previously approved these changes Jan 30, 2018
Copy link
Member

@dignifiedquire dignifiedquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all my complaints were addressed, looking good

@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

Cool! Thanks @dignifiedquire 😄

src/index.js Outdated
}),
(cb) => getSize(this.datastore, cb),
(cb) => getSize(this.keys, cb)
], (err, results) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be easier to use the object notation of async/parallel, something like this:

parallel({
  a: (cb) => cb(null, 1),
  b: (cb) => cb(null, 2),
  c: (cb) => cb(null, 3)
}, (err, results) => {
  if  (err) { handleError(err) }
  else {
    console.log(results) // { a: 1, b: 2, c: 3}
  }
})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would probably get a bit more clear. I'll do that.

@pgte
Copy link
Contributor

pgte commented Jan 30, 2018

@hacdias except for the async/parallel remark, LGTM!

@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

Done and the tests are passing. Could any of you merge this and release a new version when possible? 😃

@daviddias daviddias merged commit bd522ee into master Jan 30, 2018
@daviddias daviddias deleted the stat branch January 30, 2018 19:27
@ghost ghost removed the status/in-progress In progress label Jan 30, 2018
@daviddias
Copy link
Member

v0.18.7 released :)

@hacdias
Copy link
Member Author

hacdias commented Jan 30, 2018

Cool! Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants